CORRELATES OF DAYDREAMING:
A DIMENSION OF SELF-AWARENESS 1


JEROME L. SINGER AND ROSALEA A. SCHONBAR


Teachers College, Columbia University 


Despite the fact that most writers on per- 
sonality theory and psychopathology discuss 
daydreaming, the highly personal and ephem- 
eral quality of conscious fantasy has posed 
baffling problems to the investigator who seeks 
to formulate operational tests of the various 
theoretical notions in the field. The investiga- 
tion to be described here represents one phase 
of a general program of research designed to 
explore the functional role of daydreaming or 
fantasy behavior in the organization of per- 
sonality. Since much of the theory and em- 
pirical knowledge concerning daydreaming de- 
rives from the observations of individual cli- 
nicians with relatively limited subject samples, 
and under highly specialized conditions (psy-
choanalysis or examination of psychiatric pa-
tients), it was felt desirable to approach the
problem from a somewhat different point of 
view. For one thing, relatively little is known 
as yet concerning the actual range and vari- 
ability of daydreaming tendencies in the nor- 
mal population. There is, furthermore, little 
systematic knowledge of the relationship of 
daydreaming tendencies to other personality 
characteristics or to certain crucial dimen- 
sions of behavioral variations in presumably 
normal individuals. While a variety of studies 
with thematic apperception type of material 
have provided useful techniques for scrutiniz- 
ing patterns of such fantasy needs as achieve- 
ment and aggression, this research has not 
primarily been concerned with the more gen- 
eral role of a capacity for daydreaming. 

A synthesis of theoretical formulations and 
some empirical observations by writers such 


1 This study was supported under Public Health 
Service NIMH Grant M-2279. The authors are in- 
debted to Vivian McCraven, Judith Antrobus, and 
John Antrobus, who assisted by acting as raters and 
in various scoring and computational procedures. 


as Freud, Sullivan, Mead, and Lewin have 
suggested a view of daydreaming that has 
served as a basis for some of the tentative 
hypotheses of the study. The capacity to en- 
gage in daydreaming is, to some extent, a 
learned response which develops differentially 
as a function of certain patterns of parent- 
child relationships. Of particular significance 
in its development appears to be the oppor- 
tunity for identification with a benign pa- 
rental figure under circumstances in which 
intermittent reinforcement for the child’s con- 
trol of overt gratification seeking movements 
occurs. To some extent, mothers in our so- 
ciety tend to represent inhibition of impulses 
and also to foster aesthetic interest, while fa- 
thers represent action tendencies and the ex- 
ternal environment. Closer identification with 
a mother figure would therefore appear par- 
ticularly to be related to introspective tend- 
encies. 

The mode of translation of checked body 
movement into a capacity for instituting 
movement on an imaginal level is difficult to 
explain; Werner’s Sensory-Tonic Theory pro- 
vides the most specific approach to the prob- 
lem. With reinforcement both by parental fig- 
ures and by the general socioeconomic condi- 
tions or sociocultural milieu, fantasy or resort 
to verbal or imaginal means of dealing with 
delays becomes an increasingly differentiated 
ability which provides additional benefits, 
since it frees a person from dependence on 
the immediate perceptual situation and af- 
fords a fluid medium in which trial actions 
can occur with impunity. In adults, under 
optimal conditions, a differentiated capacity 
to engage in daydreaming may make it pos- 
sible for the individual to increase his aware- 
ness of self-other relationships, of his own 
action tendencies seen in time perspective,
and it may enhance the possibility of a po- 
tentially greater repertory of role relation- 
ships through imaginal practice. Pathological 
extremes in this personality dimension may 
involve either excessive resort to fantasy with 
consequent paralysis of fruitful motor ex- 
ploration of the environment or failure to de- 
velop fantasy tendencies (as apparently has 
occurred in certain institution reared chil- 
dren), with consequent inability to delay mo- 
tor responses and much self-defeating or de- 
structive motility. The question as to the op- 
timal degree or type of daydreaming remains 
as yet an unexplored area from the stand- 
point of empirical research. 

To the extent that Rorschach human move- 
ment (M) responses may represent tendencies 
to engage in daydreaming (Singer, 1955), 
support for the notion that daydreaming 
tendencies are associated with motor inhibi- 
tion, planfulness, and parental identification 
has come in a number of studies (King, 
1958; Shatin, 1953; Singer & Sugarman, 
1955; Singer, Wilensky, & McCraven, 1956). 
The present investigation represents an effort 
to move beyond the inferences concerning 
Rorschach M responses to a more direct study 
of daydreaming and fantasy tendencies. In 
a somewhat similar effort, Page (1957) re- 
cently reported a relationship between a ques- 
tionnaire derived daydream score and M. 


HYPOTHESES 


The general hypothesis of this study is that 
subjects (Ss) who indicate a greater fre- 
quency of daydream behavior are also char- 
acterized by greater reported frequency of 
night dreams, social introversion, and crea- 
tivity in their spontaneous reports of day- 
dreams or storytelling activity. They are, in 
addition, more likely to be identified with
their mothers (on the basis of measures of 
assumed similarity of interests); those who 
report less daydreaming, on the other hand, 
are expected to show greater evidence of re- 
pression or denial of problems and a lesser 
tendency toward identification with their 
mothers. The inclusion of a form of manifest 
anxiety scale (Welsh’s A scale) in the bat- 
tery of procedures was carried out with the 
interest of exploring the possibility of an em-
pirical linkage between daydreaming and anx-
iety. While it was felt that clinical evidence
suggests, generally speaking, a dampening of
imaginative behavior during attacks of free-
floating anxiety, it was considered likely that
the type of behavior reported on the A scale
might to some extent represent willingness to
adopt a self-scrutinizing attitude or to admit 
complaints, rather than serving as an indi- 
cator of gross differences in anxiety. 

In effect, then, the conception that is ex- 
amined empirically in this paper is that one 
of the dimensions along which people vary in-
volves the tendency or capacity to see them-
selves in a temporal or spatial perspective and
to engage in some form of imaginal living.
Operationally, such a tendency or personality
style is manifested by relative willingness to
respond to questionnaire materials of a per-
sonal sort, ability to admit a variety of in-
ternal ideational activities, and greater will-
ingness or ability to provide creative thematic
material to ambiguously structured stimuli.


SUBJECTS AND PROCEDURE 


For the preliminary investigation described here, a 
group of 44 women, graduate students in education, 
served as Ss. The Ss consisted of Negro and white 
women, married and single, most of whom were 
teachers. The major group breakdown employed for 
this study was on the basis of the median score of 
a questionnaire of daydreaming frequency. No sig-
nificant differences emerged between the High and
Low Daydream groups in age, years of education,
marital status, white-Negro ethnic groups, or socio-
economic background.

The following procedures were employed to obtain
relevant measures from Ss along the hypothesized
dimensions:

Daydream Questionnaire. A detailed inventory con-
cerning the patterns of daydreaming and the fre-
quency of occurrence of specific daydreams was de-
veloped.2 The phase of the inventory employed in
the present investigation consisted of a series of
specific daydreams. Ss were required to indicate on
a five-point scale from Very Frequently to Practically
Never the relative frequency with which they ex-
perienced each daydream. A total score was derived
for each S based on her self-weighted responses to

2 A copy of the questionnaire has been deposited
with the American Documentation Institute. Order
Document No. 6466 from ADI Auxiliary Publications
Project, Photoduplication Service, Library of Con-
gress, Washington 25, D. C., remitting in advance
$2.00 for microfilm or $3.75 for photocopies. Make
checks payable to: Chief, Photoduplication Service,
Library of Congress.
each item. This Daydream score (ranging theoreti-
cally from 93 to 558) served as the basis for divid-
ing Ss into High and Low Daydream groups. A cut
at the median score (173) was employed. It should
be noted that the internal consistency of the Day-
dream score was quite high, with Cronbach’s alpha
yielding a coefficient of .96 for a group of 240 Ss.
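
The median cut described above can be sketched in a few lines of Python. This is an illustration only: the function name, the toy scores, and the rule that a score equal to the median falls in the Low group are assumptions, since the article does not say how ties at the median were handled.

```python
from statistics import median

def median_split(scores):
    """Label each total score 'High' or 'Low' relative to the sample median.

    Assumption: scores equal to the median are placed in the Low group;
    the article does not state how ties were treated.
    """
    cut = median(scores)
    return cut, ["High" if s > cut else "Low" for s in scores]

# Hypothetical Daydream totals for six Ss (not data from the study).
cut, groups = median_split([150, 160, 170, 176, 190, 210])
print(cut, groups)
```

With an even number of Ss and no ties, a cut of this kind yields two groups of equal size.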
Frequency of Night Dreams. Each S kept a log of
her night dreams over a period of 1 month. The
score employed here was Dream Frequency, the num-
ber of separate nights during this period that S re-
membered at least one reportable dream.

Welsh’s Repression Scale—(R). On the basis of an 
extensive reanalysis of Minnesota Multiphasic Per- 
sonality Inventory responses, as well as considerable 
subsequent study, Welsh (1956) developed a scoring 
scale for MMPI which he terms the R scale. Items 
on this scale seem best characterized as reflecting for 
high scorers tendencies toward denial or repression, 
and for low scorers externalized or acting-out be- 
havior. Most scorable items are answered false for 
this scale, but Welsh’s evidence argues against a sim-
ple response set.

Welsh’s Anxiety Scale—(A). The A scale, derived
similarly by Welsh, consists of items from the MMPI 
in which “disability of a dysthymic and dysphoric 
nature” with anxiety is most prominent. According 
to Welsh’s further study of profiles from diagnostic 
groups, anxiety states fall high on A, but for Ss who 
score high on both A and R, depression is a primary 
symptom; those Ss who score high on A and low on
R show schizoid features.

Lie Scale. The 15 Lie items from the MMPI
were included as an additional measure of denial 
tendencies and to provide some indication of the ex- 
tent to which the responses to the daydream ques- 
tionnaire might be subject to conscious falsification. 

Social Introversion—(Si). A scale for social intro- 
version was derived from the MMPI by Drake 
(1956). Correlations with another measure of intro- 
version were in the .70’s for both men and women 
college students; in addition, the mean for those stu- 
dents engaging in more college activities than the 
average student showed significantly less introversion 
than the mean of those participating less than the 
average amount. 

Parental Identification Patterns. As one approach 
to the issue of similarity to parents, a questionnaire 
and procedure derived from a study by Oliner (1958) 
was employed. This questionnaire consists of 44 
items dealing with a variety of interests and ac- 
tivity patterns to which Ss indicate their reactions 
on a four-point scale from “very much like” to “very 
much dislike.” These items were responded to ini- 
tially by each S for herself, after which instructions 
called for responding to the questionnaire as “Person 
I would like to be,” and then as the items applied 
to “Mother” and to “Father.” To evaluate the rela- 
tive perceived similarity of self to mother as against 
father, a score based on the formula (Self-Father)— 
(Self-Mother) was derived. A high score on this 
variable indicates that S reported the difference be- 
tween her own interests and those of her father to 
be greater than the difference between her own in-
terests and those of her mother. A positive correla-
tion was therefore hypothesized between the (S-F)—
(S-M) score and degree of fantasy, such that the
greater the perceived similarity to mother rather 
than to father, the greater the fantasy tendency. A 
score based on the absolute difference between per- 
ceived interests of fathers and mothers (F-M) was 
also employed. 
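
As a rough illustration of how the (S-F)—(S-M) and (F-M) scores might be computed, the Python sketch below treats each discrepancy as a sum of absolute item-by-item differences. That metric is an assumption: the article reports only the formula for combining discrepancies, not how each discrepancy was scored. The function names and the five-item ratings are hypothetical.

```python
def discrepancy(a, b):
    """Sum of absolute item-by-item differences between two rating profiles.

    Assumption: the article does not specify the discrepancy metric; a sum of
    absolute differences on the four-point items is used for illustration.
    """
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of items")
    return sum(abs(x - y) for x, y in zip(a, b))

def identification_scores(self_r, mother_r, father_r):
    """Return the (S-F)-(S-M) score and the F-M discrepancy."""
    s_f = discrepancy(self_r, father_r)   # Self-Father discrepancy
    s_m = discrepancy(self_r, mother_r)   # Self-Mother discrepancy
    f_m = discrepancy(father_r, mother_r) # Father-Mother discrepancy
    return s_f - s_m, f_m

# Hypothetical ratings on five items (1 = very much like ... 4 = very much dislike).
self_r = [1, 2, 4, 3, 1]
mother_r = [2, 2, 3, 4, 1]
father_r = [4, 1, 1, 2, 3]
score, f_m = identification_scores(self_r, mother_r, father_r)
print(score, f_m)
```

A positive first value means the S's reported interests lie closer to her mother's than to her father's, which is the direction hypothesized for High Daydream Ss.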

Creativity of Spontaneous Daydream and Story- 
telling. At the conclusion of the questionnaire, Ss 
were asked to write an account of an actual day- 
dream and also to make up a spontaneous original 
story. The daydream and original story were then 
scored for Creativity, i.e., a measure of the intro-
duction of novel materials, characters, time and space
sequences, and emotional vividness. Using a definition 
of creativity in terms of the above criteria, two ex- 
aminers independently scored all protocols along a 
five-point scale for Creativity. Rater reliability for 
a larger sample of 240 Ss had been evaluated for 
this variable and was felt to be satisfactory, since in 
only 11 out of 240 ratings were there differences as 
great as two points on the five-point scale, and no 
difference as great as three points. The average of 
the two raters’ scores was employed for the final 
Creativity score.
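
The rater check and the averaging step can be sketched as follows. This is a minimal Python illustration; the function name and the ratings are hypothetical, and the article reports only the counts of two-point and three-point disagreements, not the computation itself.

```python
def combine_ratings(rater_a, rater_b):
    """Average two raters' five-point scores and count large disagreements.

    Returns the per-protocol mean rating and the number of protocols on which
    the two raters differ by two or more scale points.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same protocols")
    final = [(a + b) / 2 for a, b in zip(rater_a, rater_b)]
    n_large = sum(abs(a - b) >= 2 for a, b in zip(rater_a, rater_b))
    return final, n_large

# Hypothetical Creativity ratings for five protocols (not data from the study).
final, n_large = combine_ratings([3, 4, 2, 5, 1], [3, 5, 4, 4, 1])
print(final, n_large)
```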

Needs Achievement, Self-Aggrandizement, and Af-
filiation. In addition to scoring the structural charac-
teristics of the story and daydream, some attempt
was made to consider the specific thematic content 
of these materials. Three fantasy needs emerged with 
enough variability in most of the records to permit 
a quantitative rating. These were Need Achievement, 
scored essentially along the lines laid out by Mc- 
Clelland (1958) and Atkinson (1958), Need Self-Ag- 
grandizement (employed here as representing obtain- 
ing material possessions or display items, as well as 
high social status without particular effort or achieve- 
ment), and Need Affiliation (employed here to in- 
clude gregariousness, need for social warmth, and 
sex). It was thought that Need Achievement in par- 
ticular would relate to degree of daydream activity. 

The need scores were rated independently along a 
five-point scale and raters’ results were averaged to
give a final score for each S on each need. While
these scores could not be considered experimentally 
independent of the Creativity score, the intercorrela-
tion data below suggest that they cannot be consid-
ered merely as reflections of the Creativity score.

Vocabulary Score. To obtain a brief estimate of 
verbal intelligence, the multiple-choice vocabulary 
test from the IER Intelligence Scale CAVD was in-
cluded. This test correlates .50 with a general in-
telligence factor for a sample of adult males (Thorn-
dike, Norris, & Morrill, 1952), and it provides a
simply scored indicator of gross intellectual differ- 
ences. No specific hypotheses concerning the role of 
intelligence were formulated for this study, but the 
Vocabulary score was employed to evaluate the like- 
lihood that particular correlations which emerged 
might merely represent intelligence differences. 


RESULTS AND DISCUSSION 


Following dichotomization of the distribu- 
tion scores for each of the above variables at 
their medians, tetrachoric r’s were calculated.
The matrix of intercorrelations is presented in 
Table 1. 
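
The article does not say how the tetrachoric r’s were computed. One common shortcut is the cosine-pi approximation, sketched below in Python. When both variables are cut exactly at their medians the fourfold table is symmetric, and the approximation then coincides with the bivariate-normal result r = sin(2π(p − 1/4)), where p is the proportion of Ss above the median on both variables. The function name and the cell counts are hypothetical.

```python
import math

def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    a = high/high, b = high/low, c = low/high, d = low/low cell counts.
    Assumption: this approximation stands in for whatever method the
    authors actually used, which the article does not describe.
    """
    if a * d == 0 and b * c == 0:
        return float("nan")   # table carries no usable information
    if b * c == 0:
        return 1.0            # no discordant pairs
    return math.cos(math.pi / (1.0 + math.sqrt((a * d) / (b * c))))

# Hypothetical fourfold table for 44 Ss split at both medians.
r = tetrachoric_cos_pi(15, 7, 7, 15)
print(round(r, 2))
```

Because the coefficient depends only on the cell counts, small shifts of one or two Ss between cells can move r appreciably in a sample of this size.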

Inspection of Table 1 reveals general sup- 
port for the hypotheses in the sense that sig- 
nificant correlations in the predicted direction 
emerged between the Daydream scores and 
Dream Recall Frequency, Perceived Similar- 
ity to Father minus Perceived Similarity to 
Mother, Creativity of Spontaneous Daydream 
and Storytelling material, and Need Achieve- 
ment. Significant positive correlations also 
emerged between Daydream score and the 
A scale, Need Self-Aggrandizement, Need Af- 
filiation and Father—Mother discrepancy. The 
Repression and Lie scales correlate negatively 
(at insignificant levels) with Daydream score, 
while Social Introversion correlates positively 
as predicted, but at a nonsignificant level.
A simple graphic cluster analysis following
Tryon (1939) reveals a fairly clear-cut pat-
terning of the variables in this study. Day-
dreams, Dreams, Social Introversion, Creativ-
ity, Need Achievement, (Father—Mother),
(Self-Father)—(Self-Mother), and the Anx-
iety scale show a distinct cluster roughly
paralleling each other in the extent of inter- 
correlations in the matrix. The Repression 
and Lie scales appear to form the negative 
pole of what appears as a bipolar cluster. 
Only the Anxiety scale was correlated sig- 
nificantly with Vocabulary; Need Affiliation 
tends to follow the major cluster with some 
variations, and Need Self-Aggrandizement 
does so to a much lesser extent. 

Although paralleling the Daydream scale 
through most of the matrix, the A scale re- 
veals a unique pattern correlating negatively 
with Need Self-Aggrandizement and Need 
Affiliation and Vocabulary. Thus, Ss who re- 
port many problems tend to show fewer fan- 
tasy themes dealing with possession and status 
attainment or need for interpersonal contact. 
Social Introversion shows a somewhat simi- 
lar pattern to anxiety, with a particularly 
high positive correlation emerging with Need 
Achievement, while a moderate negative cor- 
relation is revealed with Need Self-Aggran- 
dizement. 


TABLE 1
INTERCORRELATIONS BETWEEN DAYDREAM SCORE AND OTHER VARIABLES

Variable                             1     2     3     4     5     6     7     8     9    10    11    12    13
 1. Daydreams
 2. Dreams                         .36
 3. Repression                    -.28  -.22
 4. Anxiety scale                  .48   .23  -.71
 5. Lie scale                     -.16   [?]   .35  -.54
 6. Social Introversion            .25   .22  -.25   .26  -.04
 7. Father-Mother                  [?]   .00  -.23   .45  -.39  -.04
 8. (Self-Father)—(Self-Mother)    .45   .25  -.52   [?]  -.39   .04  -.49
 9. Creativity                     .48   .20  -.40   .42  -.48   [?]   .53   [?]
10. Need Achievement               .71   .68  -.48   .22   .27   .61   .04   .35   .39
11. Need Self-Aggrandizement       .41   .20  -.33  -.27  -.41   [?]   [?]   .37   .26   .48
12. Need Affiliation               .48   .28   .06  -.34   .04   .07   .05  -.05   .47   .39   .54
13. Vocabulary                     .07   .08   .21  -.54  -.15  -.18   .00  -.08   .12   .00  -.18   .19

Easily the dominant variable in the cluster
based on size of intercorrelations is Need 
Achievement. One might infer from this re- 
sult that much of the achievement need trans- 
lated into responsiveness to the test situation 
could account for High Daydream and Night 
Dream scores, as well as the Creativity and 
Anxiety scores. This causal type of explana- 
tion founders when the high correlation be- 
tween Achievement and (S-F)—(S-M) is con- 
sidered, since the latter variable does not lend 
itself to bias resulting from an achievement 
or acquiescence set. The concomitant varia-
tions of the cluster in question seem account-
able on a more complex or subtle basis, there-
fore.

As a further check on this point, an analy- 
sis was made of identification choices of the 
Ss. As part of the questionnaire, the women 
in this group were asked to list movie or stage 
personages, historical figures, and characters 
from literature whom they emulated or wanted 
to be like. Analysis of these choices for the 
High and Low Daydream score groups re- 
vealed a significant difference in the “mascu- 
linity” or “femininity” of identification figures. 
The Low Daydream Ss chose significantly 
more male figures or women engaged in 
largely masculine pursuits or characterized 
by traits thought of as predominantly mascu- 
line (e.g., Joan of Arc, Elizabeth I of Eng- 
land, Amelia Earhart). The greater “feminine 
role” or maternal identification of the High 
Daydream Ss, as well as their high Creativity 
scores in thematic material and the high 
Need Achievement scores, lends support to the
proposed link between maternal identification
and acceptance of inner life or long-range
aspirations.

Evidence supporting the hypothesis relating 
maternal identification and daydreaming has 
also emerged for a group of male Ss of com- 
parable background. This male sample did 
not undergo the same experimental procedures 
except for the Daydream questionnaire and 
the self-ratings and will not be reported at 
length here. Daydream frequency was posi- 
tively correlated with Self-Father discrepancy 
(r = .30, N = 64) and negatively correlated 
with Self-Mother discrepancy (r = —.19, N
= 68). The results suggest that even for these 
men, the tendency to perceive oneself as simi-
lar to one’s mother and unlike one’s father is
associated to some extent with reported day- 
dreaming frequency. These results suggest the 
fruitfulness of exploring daydreaming tend- 
encies and self-awareness variables in terms 
of their linkages to family constellations and 
patterns of learning within the family situa- 
tion or cultural milieu. Only in this way can 
we hope to move beyond mere classification 
toward more theoretically derived statements 
concerning the relationships of a dimension 
such as “acceptance of inner life” or “self- 
awareness” to the general framework of per- 
sonality development. 

In conclusion, it appears from these data 
that there is a general clustering of the vari- 
ables in a manner suggesting that these women 
differ along a dimension which might be 
termed self-awareness. We can, of course, 
never be sure that High Daydream Ss actu- 
ally do produce more daydreams and have 
more conscious achievement-aspirations than 
Low Daydream Ss. Operationally, we observe 
only that they accept these phenomena as 
part of their life-space and report them more 
readily. It appears likely, however, that the 
difference in attention and admission to others 
of these inner experiences may be the psycho- 
logically significant phenomenon and that 
quantitative differences in extent of inner liv-
ing may be scientifically indeterminable.


SUMMARY 


The investigation described here represents 
one approach to a study of the functional 
role of daydreaming as a dimension of be- 
havior. It was hypothesized on the basis of 
various theoretical formulations that Ss who 
report a high frequency of daydreaming be-
havior also indicate greater frequency of re-
call of night dreams, creativity in storytelling,
Need Achievement, and, possibly, willingness 
to admit anxieties or complaints. These High 
Daydream Ss were expected also to demon- 
strate greater assumed similarity to their 
mothers than to their fathers and less evi- 
dence of repression or denial (MMPI Lie 
scale) than low frequency daydreamers. A 
group of 44 adult female graduate students 
responded to a variety of questionnaire ma- 
terials and also reported frequencies of day- 
dreams and night dreams. The test materials 
included MMPI Anxiety (A) scale, Repres-
sion scale, Lie scale, and Social Introversion
(Si), as well as a series of interest items to 
be filled out by each S for herself, ideal self, 
and as her mother and her father would have 
done. Spontaneous storytelling and daydream 
material were also elicited and scored for 
Creativity, Need Achievement, Need Affilia- 
tion, and Need Self-Aggrandizement. The re- 
sults supported the general hypothesis, indi- 
cating that daydream frequency, night dream 
recall frequency, thematic creativity, Need 
Achievement, anxiety, and relatively greater 
identification with mother than with father 
intercorrelated positively, while Repression 
and Lie scales both correlated negatively with 
the other variables in the cluster. The data 
suggest that High and Low Daydreamers 
differ along a dimension which might be 
termed self-awareness, or acceptance of inner 
experience. 
PERFORMANCE UNDER STRESS IN RELATION TO
INTELLECTUAL CONTROL AND SELF-ACCEPTANCE 1


ALLAN GOLDFARB 
Baltimore City Health Department 


In an experimental study, Williams (1947) 
found a highly significant relationship between 
Rorschach indices of intellectual control and 
performance under stress on the Wechsler- 
Bellevue Digit-Symbol test. His results sup- 
ported the validity of the Rorschach con- 
structs, indicated that the Rorschach test is 
a practical instrument for predicting behavior 
under stress, and implied that efficiency of 
performance under conditions of stress is 
mainly a function of intellectual control. This 
was pointed out by Carlson and Lazarus 
(1953), who questioned the representative- 
ness of Williams’ findings because of his ex- 
perimental procedures and the conflicting re- 
sults reported in related studies. They re- 
peated Williams’ experiment and did not 
obtain comparable correlations. These dis- 
crepant findings are similar to those contained 
in a review article on stress by Lazarus, 
Deese, and Osler (1952). 

The present research represents a modifica- 
tion of Williams’ experimental procedures in 
two important respects: in that a real-life 
stress situation was produced, and the sub- 
jects (Ss) had a common motivation to suc- 
ceed. Measures of self-acceptance were also 
obtained from the Ss. This personality vari- 
able was studied because it is a basic com- 
ponent of psychological theory which indi- 
cates that effectiveness of behavior is directly 
related to self-acceptance (Rogers, 1951; 
Snygg & Combs, 1949; Symonds, 1951). Of 
further interest is the congruity between the 
characteristics of a self-accepting person and 
the concept of a mature individual having 


1 This paper is based upon a doctoral dissertation 
completed at the University of Pittsburgh in 1954. 
The writer is indebted to members of his thesis com- 
mittee, J. Matthews, A. W. Bendig, H. W. Goodman, 
and A. D. Lazovik, for their guidance and encourage-
ment.


adequate intellectual control, as depicted by 
Beck (1945). The prediction was made that 
acceptance of self would be highly correlated
with the Rorschach indices of intellectual
control.

METHOD 
Subjects 


All the pledges (N = 30) of a campus fraternity 
volunteered to participate in this study, which was 
described as being done to investigate certain impor- 
tant questions facing clinical psychologists. These 
pledges had been selected by the fraternity from a 
large number of applicants; they were all highly 
motivated to become active members. Achieving ac-
tive status was dependent upon the over-all impres-
sion made by each pledge on the fraternity members.
This factor was especially crucial during the time
this study was being done, since the Ss were in the
trial stage of their pledge period. Consequently, an
important motivation common to this group of Ss 
was to make a favorable impression on the fra- 
ternity members. 


Procedure 


In the first phase of the study, the Rorschach
was administered individually,
based on Beck’s (1944) procedures. This took ap-
proximately 1 month.

The Ss then met as a group in a classroom to com-
plete five practice trials on the Wechsler-Bellevue
Digit-Symbol test (Wechsler, 1944). During the ad-
ministration of the sample form of the Digit-Symbol
test the experimenter (E), in giving instructions to
the group, referred to the identical sample form
which had been reproduced on the blackboard. Each
of five trials of the Digit-Symbol test was completed
within 90 seconds, with a 1-minute rest period be-
tween each trial. The number of items on the Digit-
Symbol test had been increased so that the complete
test would not be finished in the allotted time. The
Ss then completed the Berger Scale of Expressed Ac-
ceptance of Self (Berger, 1952) which yielded the
Control Level scores. The Berger scale is self-ad-
ministering and is composed of 36 items. Selection
of these items was made according to the definition
of a self-accepting person derived in an earlier study
by Sheerer (1949). The respondent rates each item
on a scale from 1 to 5 depending upon how true he
feels the item to be in describing himself. To be valid, 
use of the Berger scale requires that the Ss be un- 
identified. As it was necessary for the purpose of this 
study to identify the pledges in order to compare 
their responses under control and stress conditions, 
the Berger scale was covertly coded. 

The second group session took place the day after 
the first meeting. Twelve volunteers from the group 
of assembled pledges were taken to another room, 
where they completed six additional trials of the 
Digit-Symbol test. A 3-minute rest period was taken 
at the end of the third trial, instead of the standard 
1-minute interval. The results of the last three trials 
of this series yielded the Digit-Symbol Control Level 
scores.

For the final phase of the experiment, the 12 Ss 
were taken to the Psychology department's labora- 
tory in another part of the building to take three 
more trials of the Digit-Symbol test (Stress Level 
scores). This procedure of having the Ss complete 
the Digit-Symbol test under control and stress con- 
ditions yielded the behavioral criterion (decrement 
in performance on the Digit-Symbol test) patterned 
after Williams’ study. In addition, the current cri-
terion measure was designed to incorporate and em-
phasize psychological stress variables which would be
stressful to a pledge to the degree that he was lack-
ing in the characteristics of a self-accepting person
as defined by Berger (1952). This was accomplished 
by exposing the Ss to psychological stress which com- 
prised externally applied pressures as a standard for 
behavior, maximized their need to deny or distort 
unacceptable personal characteristics, and indicated 
that public comparisons would be made of their in- 
dividual performance results. 

In the laboratory, two identical continuous panels 
had been constructed and placed back-to-back 5 
inches apart in the center of a long table, with six 
positions on each side. Each section of the panel fac-
ing the S included a white light and a red light. Ex-
tending perpendicularly from the main panel on each
side of a given position was a smaller panel. All the 
panels were 1 foot high. Consequently, when the Ss 
were completing the experimental tasks, they could 
not see the progress made by any of the adjacent 
pledges. All the lights in the laboratory were on; 
also, two No. 1 photoflood lamps mounted on tri- 
pods were placed on the table, one at each end, and 
beamed at the pledges without causing a direct glare. 
At one end of the long table in full view of the Ss 
was the shocking apparatus. This consisted of an in- 
clined panel board which had separate switches for 
the lights and the electric shock, and was the ter- 
minal point for the maze of wires which led from 
the apparatus to the individual positions. 

After the Ss were seated, electrodes were attached 
to the nonwriting hand of each pledge by E. He 
then stood at the end of the table near the shocking 
apparatus, where he could easily be seen by all the 
pledges. The following instructions were given, pat- 
terned after those of Williams: 


You are now being observed by a number of 
psychologists who are taking notes and continuous 
Photographs of all your reactions throughout the 
remainder of this experiment, [At one end of the 
table stood a graduate student who operated a 
portable motion picture camera, Immediately be- 
hind each group of three Ss stood a graduate stu- 
dent who served as a judge; the four judges aus 
cluded one female. There was no communication 
between them while the Ss completed the tests. 
During the experiment and the rest intervals, they 
took sham notes of the Ss’ behavior. At the end of 
cach trial the judges shifted position, which served 
to randomize the effect of any particular judge on 
the S.] All directions are to be followed implicitly.
Rest your arm attached to the electrodes on the
table and keep it there from now on. You will
notice that the electrodes on your arm are now 
connected to the panel before you. [White light 
turned on and left on.] The white light that has 
just gone on indicates that our shock apparatus 
has been turned on. You are connected to this apparatus.
During the following period you may receive
a strong electric shock whenever the observers
feel that your test performance is not up
to our standards. Whenever the red light goes on, 
you are not meeting our standards and you are in 
danger of being shocked, like this. [Red light 
turned on individually for each pledge, and followed
by an electric shock. Red light turned off.]
Based upon the psychologists’ evaluation of your 
reactions and your tests, each of you will be com- 
pared to all the rest of the pledges. You will be 
compared for personality factors and intelligence. 
These lists will be posted in your fraternity house 
2 weeks from now, so that all of you can see how 
you compare with the rest of the group. [These 
results posed a realistic threat to the Ss, who were 
all highly motivated to make a favorable impres- 
sion on the fraternity members in order to achieve 
active status in the fraternity.] Now pick up your 
pencil and write your name, seat number, and
group number in the upper right-hand corner. You
will see that this is the same test form you just
took downstairs. Your instructions are the same as
before. At the signal, “Go,” turn the sheet over
and work as fast as you can until you are told to
stop. Concentrate on your work. Remember, you
are being observed and continually photographed. 
Your work will be compared with the rest of the 
pledges, and you will be shocked whenever your 
work falls below our standards. Get set for Trial 1.
Go!

Three seconds after the Go signal, the red light
was turned on. After a 5-second interval, the electric
shock was administered; immediately afterward, the
red light was turned off. ([...] delivered through the
electr[...] [...]terval timer which activate[...]
Each pledge could be individually shocked, since the 
shocking apparatus was connected to a 12-pole posi- 
tion switch. After the shock had been on for 0.4 
second, it was automatically turned off. Simultaneously,
the red light was turned off. The electric shock
was then given to the S whose position number next
appeared on E’s list, which had been previously derived
by chance selection of the panel position numbers.)
Ninety seconds after starting Trial 1, the
pledges were told to “Stop.” During the 1-minute 
interval before being instructed to start again, the 
pledges were again informed of the use to be made 
of their performance results. The same procedure 
was repeated in Trials 2 and 3. 

After Trial 3 of the Digit-Symbol test was con- 
cluded, the electrodes were removed from each S's 
hand. At that time, to ascertain the sensitivity of 
the Berger scale to situational influences, the test 
was taken by the Ss under stress conditions (Stress 
Level scores). They were given instructions similar 
to those used during the Digit-Symbol testing. As 
the Ss left after finishing the Berger scale, they were 
cautioned not to return to the room where the other 
pledges were gathered. The identical experimental 
procedures were repeated with the remaining pledges, 
the second group including 12 Ss, and the third 
group consisting of 6 Ss. 

Shortly after completion of the study, E met with 
the pledges and explained to them the purposes of 
the research. They were assured of the confidentiality 
of the data, and that their performance in the study 
would have no bearing on their status in the fra- 
ternity. 


RESULTS 


The Rorschach records were individually 
scored according to Beck (1944) and the fol- 
lowing three measures of intellectual control 
were derived: F+% for the total record, 
F+% for the color cards alone, and Sum C/ 
Total C. A high F+% is held to indicate a 


TABLE 1
SUMMARY OF RORSCHACH PERFORMANCE

Rorschach category /
experimental group         Mean     Range      SD
Sum C/Total C
   Goldfarb                 .80     0-1.2      .28
   Carlson & Lazarus        .83     .5-1.0     .18
   Williams                 .88     .5-1.2     .20
F+% Total
   Goldfarb                77.0     52-100   12.28
   Carlson & Lazarus       76.8     50-100   13.31
   Williams                81.6     70-100    6.59
F+% Color Cards
   Goldfarb                72.5     0-100    20.83
   Carlson & Lazarus       70.1     0-100    26.40
   Williams                75.5     50-100    9.28


TABLE 2
SUMMARY OF DIGIT-SYMBOL TEST PERFORMANCE

Digit-Symbol measure /
experimental group               Mean       SD
1. Control Level
   Goldfarb                      88.15    17.16
   Carlson & Lazarus             85.1     16.69
2. Stress Level
   Goldfarb                      79.63    14.92
   Carlson & Lazarus             79.3     15.19
Stress Decrement (1 minus 2)
   Goldfarb                       8.52     7.67
   Carlson & Lazarus              5.8      8.34
   Williams                      10.4      5.70
t test
   Goldfarb                       5.26*
   Carlson & Lazarus              3.41*

* Significant at .01 level.


high degree of intellectual control, while the 
converse relationship is stated for the Sum C/ 
Total C measure (Beck, 1944). 

Table 1 shows that the Ss’ scores for the 
specified Rorschach measures are very con- 
sistent with those reported by Williams and 
by Carlson and Lazarus. This signifies that 
similar groups of Ss were used in all three 
studies. 

The group of pledges reached a plateau of 
no further improvement on the Digit-Symbol 
test by Trial 11. This finding corresponded 
with the results found by Williams, which 
was not the case in the Carlson and Lazarus 
study. The Stress Decrement scores in the 
current study very likely represented a decre- 
ment from a level of maximum performance 
for the pledges. The measure of Stress Decre- 
ment was computed by determining the mean 
number of digits correctly completed for 
Trials 9, 10, and 11 (Control Level) minus 
the mean number correctly completed for 
Trials 12, 13, and 14 (Stress Level). In this
study, and in the other two, the magnitude of
the Stress Decrement measures indicated
that all Es produced comparably stressful conditions
by their procedures. These data are
contained in Table 2.
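
The Stress Decrement computation described above can be sketched in Python as follows. The function name and the per-trial scores are illustrative assumptions and are not data from the study; only the trial groupings (Trials 9-11 and Trials 12-14) come from the text.

```python
from statistics import mean

def stress_decrement(trial_scores):
    """Return (control_level, stress_level, decrement) for one S.

    trial_scores: list of 14 Digit-Symbol scores, Trial 1 first.
    Control Level = mean of Trials 9-11; Stress Level = mean of Trials 12-14.
    """
    if len(trial_scores) != 14:
        raise ValueError("expected scores for 14 trials")
    control_level = mean(trial_scores[8:11])   # Trials 9, 10, 11
    stress_level = mean(trial_scores[11:14])   # Trials 12, 13, 14
    return control_level, stress_level, control_level - stress_level

# Hypothetical scores for a single S (not taken from the study).
scores = [60, 66, 71, 75, 78, 80, 83, 85, 87, 88, 89, 74, 81, 82]
control, stress, decrement = stress_decrement(scores)
print(control, stress, decrement)
```

Applied to the group means in Table 2, the same subtraction gives 88.15 - 79.63 = 8.52 for the present sample.
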
TABLE 3
INTERCORRELATIONS AMONG DIGIT-SYMBOL TEST MEASURES

                                                           Corrected
Digit-Symbol measure /         Control      Stress         Stress
experimental group             Level        Decrement      Decrement
Stress Decrement
   Goldfarb                      .46
   Carlson & Lazarus             .42
   Williams                      .05
Corrected Stress Decrement
   Goldfarb                     -.09          .84
   Carlson & Lazarus            -.01          .20
Improvement under Stressᵃ
   Goldfarb                      .19         -.25           -.34
   Carlson & Lazarus             .09          .33            .32
Stress Maximumᵇ
   Goldfarb                      .63          .93
   Carlson & Lazarus          [illegible]     .94
   Williams                                   .93
Stress Level
   Goldfarb                      .86
   Carlson & Lazarus             .37

ᵃ Last stress trial minus first stress trial.
ᵇ Highest control trial minus lowest stress trial.


Perhaps a more valid criterion measure of 
experimental stress for the Ss’ performance 
on the Digit-Symbol test is the Stress Maxi- 
mum score found by subtracting the first 
Stress trial (Trial 12) from the last control 
trial (Trial 11). It is suggested that Trials 
12, 13, and 14 comprise not only decrement 
in performance under stress, but also include 
the factor of recovery from stress. The fol- 
lowing analysis supports this hypothesis. The 
difference between the successively larger 
mean scores made on Trials 12 and 13 was 
found to be statistically significant beyond 
the .01 level (t = 7.56). However, the mean
difference in scores made on Trials 13 and
14 was found to be insignificant (t = .41).
A possible interpretation of these findings is 
that Trial 12 represented the primary reac- 
tion to the stress situation, while Trials 13 
and 14 also reflected the Ss’ efforts to recover 
from and stabilize their reactions to the stress 
condition. If one were to use the Stress Maxi- 
mum score as a measure of experimental 
stress, then the average decrement for this 
sample was a drop of 15 points. This ex- 
tremely significant decrement in the Ss’ per- 
formance on the Digit-Symbol test very likely 
indicates the period of the greatest influence 
of the stress factors. Since the Stress Maxi- 
mum scores correlated very highly with the 
Stress Decrement scores (r = .93),² as was
also found with the Williams and the Carlson 
and Lazarus studies, they were not correlated 
with the other measures. 

The intercorrelations among the perform- 
ance test measures are presented in Table 3. 
A correlation of .36 is required for significance 
at the .05 level. In this research, as in the one 
by Carlson and Lazarus, the degree of Stress 
Decrement on the Digit-Symbol test is corre- 
lated with the Control Level. To eliminate the 
influence of the Control Level of performance, 
a statistical procedure patterned after that of 
Carlson and Lazarus (1953, p. 250) was used 
to derive the Corrected Stress Decrement 
scores. Although these results were signifi- 
cantly correlated at the .05 level with the 
measure of Improvement under Stress, the 
latter scores were also correlated with the 
various personality indices in order to com- 
pare them with the Carlson and Lazarus find- 
ings. The Improvement under Stress scores 
were obtained by subtracting the score for 
Trial 12 (first trial under stress) from the 
score for Trial 14 (last trial under stress). 
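
The .36 criterion quoted above can be checked with a short calculation. The sketch below assumes a two-tailed test with N = 30 Ss (28 degrees of freedom) and takes the critical t of 2.048 from standard tables; the function names are illustrative.

```python
import math

def critical_r(t_critical, n):
    """Smallest |r| reaching significance, given the critical t for df = n - 2."""
    df = n - 2
    return t_critical / math.sqrt(t_critical ** 2 + df)

def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero with df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# 2.048 is the tabled two-tailed .05 critical t for 28 df (N = 30 Ss).
print(round(critical_r(2.048, 30), 3))
# The Stress Decrement vs. Control Level correlation of .46 in Table 3.
print(round(t_from_r(0.46, 30), 2))
```

Under these assumptions the first value rounds to the .36 cited in the text, and the t for r = .46 exceeds the critical value.
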

Table 4 indicates that in this study no sig- 
nificant correlations were found between the 
Rorschach indices of intellectual control and 
performance under stress, as measured on the 
Digit-Symbol test. These findings do not sup- 
port the hypotheses that the Rorschach test 

² All coefficients of correlation reported in this study were derived by the Pearson product-moment method.


TABLE 4
CORRELATIONS BETWEEN DIGIT-SYMBOL TEST MEASURES AND RORSCHACH MEASURES

Digit-Symbol measure /         Sum C/       F+%        F+%
experimental group             Total C      Total      Color Cards
Stress Decrement
   Goldfarb                     -.02        -.14        -.07
   Carlson & Lazarus            -.37         .16         .06
   Williams                      .35        -.61        -.72
Corrected Stress Decrement
   Goldfarb                     -.03        -.02         .00
   Carlson & Lazarus          [illegible]    .05         .02
Improvement under Stress
   Goldfarb                      .14         .13        -.08
   Carlson & Lazarus            -.07         .12         .03

TABLE 5
SUMMARY OF BERGER SCALE PERFORMANCE

                                            Experimental group
Berger scale measure                         Mean         SD
1. Stress Level                             145.03      14.07
2. Control Level                            137.87      17.29
                                           (135.50)ᵃ   (22.36)
Stress Level minus Control Levelᵇ             7.16       9.09
t test                                        4.09*

ᵃ Data in parentheses obtained by Berger with day college students [...].
ᵇ Measure of Increase in Self-Acceptance (Stress).
* Significant at .01 level.


can be used to predict behavior under stress, 
and that reactions to stress are mainly a func- 
tion of the personality variable of intellectual 
control. Furthermore, the validity of these 
Rorschach constructs is not confirmed. The 
results of the current study are in marked 
contrast to those found by Williams, but are 
consistent with the correlations obtained by 
Carlson and Lazarus. 

The mean difference in the Ss’ scores on the 
Berger scale obtained under control and stress 
conditions was statistically significant at the 
.01 level. This finding supports the validity 
of the psychological stress experienced by the 
Ss; indicates that the Berger scale provides 
a sensitive measure of self-acceptance; and 
reveals that the pledges as a group presented 
themselves as being more self-accepting, i.e., 
more mature and independent, when they 
learned that their responses to the Berger 
scale would be made public. Comparison of 
the present results derived under control con- 
ditions with the findings obtained by Berger 
with a group of day college students, a 
sample comparable to the pledges used in 
this study, indicated consistency of results. 
Table 5 lists these results. 

Analysis of the intercorrelations among the 
Berger scale measures indicated that Increase 
in Self-Acceptance (Stress) was significantly 
correlated at the .05 level with the Control 
Level (r = .37). To eliminate the influence of 
the Control Level of performance, Carlson 
and Lazarus’ (1953, p. 250) statistical pro- 
cedures were followed to derive the measure 
designated the Corrected Increase in Self-Acceptance
(Stress). The latter measure correlated
-.05 with the Control Level scores,
and .79 with the Increase in Self-Acceptance
(Stress) scores. The correlation between the
Control Level scores and the Stress Level
scores was .84.
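
The text does not spell out the Carlson and Lazarus correction. One generic way to remove the influence of a baseline from a change score is to keep the residuals from a least-squares regression of the change on the baseline. The sketch below illustrates that generic technique only; it is an assumption for illustration, not the cited procedure, and the scores are hypothetical.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residualize(change, control):
    """Residuals of change after a least-squares regression on control."""
    mx, my = mean(control), mean(change)
    slope = (sum((c - mx) * (d - my) for c, d in zip(control, change))
             / sum((c - mx) ** 2 for c in control))
    return [d - (my + slope * (c - mx)) for c, d in zip(control, change)]

# Hypothetical Berger scores for six Ss (not data from the study).
control = [120, 131, 138, 142, 150, 163]
stress = [133, 135, 149, 147, 160, 166]
increase = [s - c for s, c in zip(stress, control)]
corrected = residualize(increase, control)
print(round(pearson_r(increase, control), 2))   # raw change vs. baseline
print(round(pearson_r(corrected, control), 2))  # residual vs. baseline
```

A pure regression residual is uncorrelated with the baseline by construction, so the reported value of -.05 suggests that the cited procedure differed from this sketch at least in detail.
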

The correlations between the Berger scale 
measures and the Rorschach indices of intel- 
lectual control included one statistically sig- 
nificant relationship. The measure of Cor- 
rected Increase in Self-Acceptance (Stress) 
was found to be negatively correlated at the 
.05 level with the F+% on the Rorschach
color cards (r = -.38). This suggests that
the Ss with lesser degrees of intellectual con- 
trol tend to present themselves as being more 
self-accepting under conditions of stress. 

No significant relationships were found be- 
tween Ss’ performance on the Digit-Symbol 
test and the measures of self-acceptance ob- 
tained under control and stress conditions. 
This finding neither supports the hypothesis
that a major personality correlate of behavior
under stress is the variable of self-acceptance,
nor indicates that the Berger
scale can be used to predict performance under
stressful conditions.


DISCUSSION


No significant relationships were found in 
the present study between performance un- 
der stress and the personality variables of in- 
tellectual control and self-acceptance. These 
Rorschach findings match those of Carlson 
and Lazarus, but differ markedly from Wil- 
liams’ results. 

The possibility that the experimental procedures
may have obscured significant relationships
merits further study. This concerns
the practice of using a single score as a measure
of the S’s performance under stress which,
in turn, is correlated with other scores representing
personality variables. A performance
score may mask several important components 
and patterns of behavior. These may not only 
vary between Ss who achieve identical scores 
but, if separately correlated with selected com- 
ponents of the personality variables, could 
conceivably yield significant relationships (see 
the excellent discussion of this problem by 
Lazarus et al., 1952). 

The increased mean score made by the 
pledges taking the Berger scale under conditions
of stress may be interpreted as a defensive
reaction. This is based on the premise
that the pledges experienced psychological 
stress in anticipation that their personal char- 
acteristics were not as acceptable to the fra- 
ternity members as was indicated on the 
Berger scale under control conditions. From
this viewpoint, rather than being solely a 
measure of self-acceptance, the increased mean 
scores include an emotional component of de- 
fensive behavior. 

No relationship was found between the 
Berger scale taken under control conditions 
and the Rorschach test. This finding does not 
support the initial hypothesis that greater de- 
grees of self-acceptance are associated with 
increased intellectual control. The significant 
negative correlation found between the vari- 
ables of Corrected Increase in Self-Acceptance 
(Stress) on the Berger scale and F+% on 
the color cards of the Rorschach test may be 
viewed as suggesting that the increased mean
score of self-acceptance obtained under con- 
ditions of stress primarily reflects defensive 
behavior, and increased defensiveness by the 
pledges is related to correspondingly lesser 
degrees of intellectual control. It should be 
noted that this significant correlation may 
have arisen by chance, since it was one of a 
much larger number of relationships investi- 
gated in the study. 
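
The caution about chance findings can be made concrete with a rough calculation. The sketch below assumes the tests are independent, which correlations computed on the same Ss are not, so it is only an approximation; the count of 20 tests is hypothetical, since the number of relationships examined is not reported.

```python
def familywise_error(alpha, n_tests):
    """P(at least one false positive) across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests

def expected_false_positives(alpha, n_tests):
    """Expected count of chance 'significant' results when every null is true."""
    return alpha * n_tests

# 20 is a hypothetical count; the study does not report how many were run.
print(round(familywise_error(0.05, 20), 3))
print(expected_false_positives(0.05, 20))
```
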


SUMMARY 


The present study investigated the relation- 
ship between performance under stress and 
the personality variables of intellectual con- 
trol and self-acceptance. In an attempt to pro- 
vide a more valid and definitive test of these 
relationships, the Ss were presented with a 
realistic stress situation and had a common 
motivation to succeed. The behavioral cri- 
terion was decrement in performance on the 
Digit-Symbol test. Measures of intellectual 
control were derived from the Rorschach test, 
and the Berger scale was used to obtain measures
of self-acceptance. The following research
procedure was observed: (a) the Rorschach
test was administered to each of 30
pledges, which took approximately 1 month; 
(b) the pledges then met as a group and com- 
pleted five practice trials on the Digit-Symbol 
test, and took the Berger scale under control 
conditions; (c) the following day the pledges
met again as a group to complete six more
trials on the Digit-Symbol test, the latter
three serving as control measures; and (d)
they were then immediately taken to the ex- 
perimental laboratory to complete three more 
trials on the Digit-Symbol test and to take 
the Berger scale under conditions of stress. 

The major conclusions to be drawn from 
this study based on the experimental condi- 
tions are as follows: (a) support is lacking 
for use of either the Rorschach test or the 
Berger scale to predict performance under 
stress, (b) the personality variables of intel- 
lectual control and self-acceptance do not ap- 
pear to be major correlates of behavior under 
stress, and (c) confirmation is lacking for the 
validity of the Rorschach constructs. The 
Rorschach findings are consistent with those 
of Carlson and Lazarus, who did not obtain
results comparable to those of Williams when
they replicated his study.
SOCIAL DESIRABILITY AND RESPONSE BIAS
IN THE MMPI 


CHARLES HANLEY 


Michigan State University 


Two sources of individual differences in re- 
sponses to self-report inventories stand apart 
from traditional concepts of personality. The 
first is the degree to which subjects (Ss) are 
affected by the “social desirability” of atti- 
tudes expressed in inventory items. Scales 
that measure “defensiveness,” “plus-getting,” 
“dissimulation,” and “malingering” focus on 
this factor. The second is seen when Ss are 
influenced by the form of the answer sheet. 
Measures of “acquiescence,” “response set,” 
and “response bias” are concerned with the 
effects of response categories. Both kinds of 
measure, it is hoped, will ultimately be use- 
ful in suppressing personality scale variance 
irrelevant in diagnosis and screening. 

A recent study reported by Wiggins (1959) 
yields important information on a number of 
scales used for the MMPI. Eleven different 
measures, nine of which deal with some as- 
pect of social desirability, are compared and 
found to differ widely in ability to discrimi- 
nate between protocols of undergraduates in- 
structed to give the socially desirable answer 
to each MMPI item and protocols obtained 
under standard instructions. Wiggins draws 
conclusions from this study that raise general 
questions regarding past and future work with 
measures of social desirability. Wiggins distinguishes
two approaches to the measurement
of test-taking defensiveness; these differ
in the manner in which scales are constructed 
and originally validated. A measure built to 
discriminate Ss given instructions aimed at 
maximizing defensiveness from Ss taking the 
inventory under normal conditions has been 
constructed by the “empirical” method and 
possesses “empirical” validity. A scale suc- 
cessfully devised to correlate in expected di- 
rections with diagnostic scales has been con- 
structed by the “rational” method and has 
“rational” validity. The two most effective
measures in Wiggins’ study are both empiri- 
cal scales. Wiggins concludes that “empirical 
methods are the methods of choice” (p. 427) 
in constructing measures of social desirability. 
From analysis of correlations between the 
various measures, Wiggins suggests that ear- 
lier studies, presumably those using rational 
methods, “would be more appropriately con- 
sidered as studies of response bias” (p. 426). 
Finally, from reading the paper it is difficult 
to escape the impression that the empirical 
method of validation he employs is a close 
approximation of the real life screening and 
diagnostic situation. 

The purpose of the present paper is to ex- 
amine: (a) the degree to which effectiveness 
in his study is consistently related to the em- 
pirical-rational distinction as well as to other 
dimensions he has not considered, (b) whether 
the influence of response bias in rational 
measures is as clear as he suggests, and (c) 
whether empirical validation, when employed 
with specially instructed and standard groups, 
is free from defects specific to the procedure. 
The first point can be clarified by detailed 
examination of Wiggins’ data. The second 
and third require additional data obtained 
for the purpose. The analysis that follows is 
not intended to dispute the potential effec- 
tiveness in the real life situation of any spe- 
cific scale, but rather to consider general 
procedural questions that bear on the con- 
struction of useful measures of the influence 
of social desirability. 


CLASSIFICATION AND EFFECTIVENESS
OF MEASURES


Characteristics of measures of defensiveness
can be illustrated by referring to eight scales
studied by Wiggins. (Three others are omitted
TABLE 1
CONSTRUCTION AND EFFECTIVENESS OF MMPI MEASURES IN WIGGINS (1959) STUDY

                       Item       Response                          Effectiveness
Scale   Validation     Content    Frequencies   Aim                 (phi coeff.)
Sd      Empirical      Explicit   Yes           Defensiveness       .721ᵃ
Cof     Empirical      Implicit   Yes           Defensiveness       .[?]19
L       None           Explicit   (Guessed)     Defensiveness       .539
Ex      Rational       Explicit   Yes           Def. & Plus-get.    .[?]61
SDᵇ     Rational       Explicit   No            Defensiveness       .330
K       Empirical      Noneᶜ      Yes           Def. & Plus-get.    .217
Ds      Empirical      Implicit   Yes           Dissimulation       [illegible]
B       Rational       None       Yes           Response Bias       -

ᵃ .683 on cross-validation.
ᵇ Labeled En in original study.
ᶜ Except for eight items.


in the interest of simplicity; they play a 
minor part in his analysis.) These are L and 
K, both standard MMPI measures, Edwards’ 
SD (Fordyce, 1956), Ex (Hanley, 1957), 
Cof (Cofer, Chance, & Judson, 1949), Sd 
(Wiggins, 1959), Ds (Gough, 1954), and B 
(Fricke, 1957). All but the last are concerned
with social desirability. 

Wiggins emphasizes the importance of 
method of validation. To it should be added: 
(a) use of the results of some type of judg- 
ment of item content in determining whether 
or not to include items in a scale, (b) use of 
response frequencies to determine inclusion or 
rejection of items, and (c) the original aim of 
the scale. Similarities and differences among 
the eight measures with respect to all of these 
variables are summarized in Table 1. 

Method of Validation. Table 1 indicates
that empirical and rational procedures were 
used in the original validation of most of the 
scales. The L scale, however, was included in 
the MMPI without a validity study. 

Item Content. Item selection for several 
scales was wholly or partly dependent on ex- 
plicit judgments of item content. The L scale, 
for example, consists of items written to allow 
defensive individuals to claim unrealistically 
favorable traits. Edwards selected items for 
his SD measure after 10 judges gave socially 
desirable answers to a pool of F, K, and Tay- 
lor MAS items. Judgments of item content 
also played an important role in the construction
of the Sd and Ex measures.

The Cof and Ds scales were derived in part 
by having certain Ss “fake” roles. These instructions
seem to involve implicit judgment
of item content on the part of such Ss. Eight
of the 30 K items were also chosen on the
basis of results of a faking study (Meehl &
Hathaway, 1946, p. 543).

Item content was not considered in the 
derivation of Fricke’s B measure. Selection
without attention to individual item content 
can be illustrated in the case of the 22 
items that constitute the L6 scale (Meehl & 
Hathaway, 1946). 


In brief, L6 was derived by an item analysis of the
responses of 25 males and 25 females in the psychopathic
hospital whose profiles showed an L score of
T = 60 or more and who, with the exception of six
normal cases, had diagnoses indicating the probability
that they should have had abnormal profiles but
whose profiles were in reality within the normal
range (p. 540).


The item responses of these fifty cases handled separately
for males and females were compared to the
male and female item frequencies from the general
group of males and females that has been used in
past scale derivations. In all, 22 items were chosen
as a result of this comparison (p. 541).


After these items had been selected, Meehl
and Hathaway described them as giving an
“over-all impression” of “impunitiveness” (p.
541). That selection on the basis of item content
is not the same as interpretation following
selection is indicated by the fate of Booklet
Item 461, keyed “true” on Sd, Cof, and
Ex but “false” on K.

Response Frequencies. Several measures
were derived wholly or partly by use of response
frequencies obtained from groups taking
the MMPI. The quotations from Meehl
and Hathaway given in the preceding paragraph
indicate the use of frequency data in
selecting items for the L6 scale incorporated
into K. To obtain the remaining eight K 
items, moreover, response frequencies also 
played a role. Items in the Ex pool were in- 
cluded only if 36-64% of Hathaway’s college 
sample had endorsed them. The Cof, Ds, and 
Sd scales were constructed in part by com- 
paring frequencies of endorsement obtained 
from various groups given special and stand- 
ard instructions. 

Response frequencies were not used in the 
construction of the SD measure, although 
some of its items were drawn from the K 
pool and thus remotely reflect response data. 
The authors of the L scale did not inspect response
frequencies for their 15 items but assumed
that few honest persons could answer
them in the socially desirable direction.

Items for the B scale, a measure of re- 
sponse bias, were chosen entirely on the basis 
of response frequencies, only those endorsed 
by 40-60% of Hathaway’s normative sample
being used.

Aim. Most of the measures in Table 1 are
oriented toward test-taking defensiveness, the
tendency to give socially desirable rather than
personally relevant answers. K and Ex, however,
also aim at plus-getting, the tendency to
be overly critical of oneself. The Ds scale is
directed at the detection of deliberate plus-getting.
The B scale, as indicated before, is
aimed at response bias rather than defensiveness.

Effectiveness. Wiggins presents extensive 
data on relative effectiveness of the various 
scales in discriminating between a sample of 
250 college students instructed to give the 
socially desirable response to each MMPI 
item and a sample of 190 students taking the 
inventory under standard conditions. Wiggins 
determined mean scores separately for men 
and women. Scales that significantly differen- 
tiated records obtained under the two condi- 
tions were analyzed to estimate the degree to 
which this differentiation was accurate. His 
data, expressed as phi coefficients, are shown 
in the last column of Table 1. These are 
based on pooled male and female protocols. 
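
A phi coefficient of this kind can be computed from a fourfold table of instruction condition by classification at some cutting score. The sketch below shows the standard formula; how Wiggins set the cutting scores is not described here, and the cell counts are hypothetical, with only the group sizes of 250 and 190 taken from the text.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table [[a, b], [c, d]].

    a: instructed Ss at or above the cut, b: instructed Ss below it,
    c: standard Ss at or above the cut, d: standard Ss below it.
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return (a * d - b * c) / denom

# Hypothetical split of 250 instructed and 190 standard protocols.
print(round(phi_coefficient(200, 50, 40, 150), 3))
```

A scale that separated the two groups perfectly would yield a phi of 1, and one that classified both groups alike would yield 0.
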

Using the categories in Table 1, we can examine
in order the qualities associated with
effectiveness in Wiggins’ study. First is validation.
The L scale does well despite lack of any
original validation, although it is by far the 
shortest scale studied. While the most effec- 
tive measures are the empirical Sd and Cof 
scales, the equally empirical K scale is the 
worst of the lot. Empirical validation, it ap- 
pears, has no systematic advantage over other 
types. 

A noticeable characteristic of effective meas- 
ures appears in the item content column of 
Table 1. The single defensiveness scale not 
consistently employing attention to item con- 
tent is K, which fares badly. 

Data on response frequencies are equally 
useful. The one defensiveness measure not 
using such information is SD, which is rela- 
tively ineffective. Even “guessed” response 
frequencies, as in the case of L, are better 
than none according to Wiggins’ results.

Comparison of the Aim and the Effective- 
ness columns indicates scales designed to 
measure defensiveness may have some ad- 
vantage in Wiggins’ study over scales with 
broader aims. Ex, devised to measure both 
defensiveness and plus-getting, does not fare 
badly, but K, with similar aims, is ineffec- 
tive. Scales oriented toward behavior other 
than defensiveness, as in the case of Ds and 
B, are completely ineffective, a result that is 
not unexpected. 

In summary, the entries in Table 1 indi- 
cate that several characteristics distinguish 
between measures that were effective and ineffective
in Wiggins’ investigation. Lack of
attention to item content and to response frequencies
is more clearly associated with
ineffectiveness than is the empirical-rational
dimension. The advantage for empirical validation
is not as systematic as Wiggins’ conclusions
indicate.


RATIONAL VALIDITY AND RESPONSE BIAS


The superiority of Sd and Cof is based 
solely upon empirical validation. For Ex, K,
and SD, Wiggins’ results reveal superior ra- 
tional validity, that is, higher correlations 
with MMPI diagnostic keys. Supporting his 
preference for the empirical approach is the 
suggestion that these correlations between de- 
fensiveness and diagnostic measures result 
from response bias. He presents additional
data showing rational measures to be highly
correlated with Fricke’s B scale. These same 
data, however, indicate that the empirical K 
measure also correlates highly with B. Care- 
ful consideration of these results is needed. 

The measurement of response bias is based 
entirely on rational validity. Any scoring key 
with an imbalance of “true” and “false” re- 
sponses is expected to correlate with any 
other imbalanced key. When Wiggins reports 
a correlation of —.638 between K and Sc, for 
example, it can be attributed to response bias, 
because all but one of the 30 K items are 
keyed false, and 59 of the 78 Sc items are 
keyed true. The S set to answer true should 
get a high score on Sc and a low one on K. 
The reverse holds for the person biased to 
answer false. Individual differences in defen- 
siveness, however, lead to the same empirical 
result. 
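
The keying argument can be illustrated by scoring two extreme response styles. In the sketch below only the key counts (29 of 30 K items keyed false, 59 of 78 Sc items keyed true) come from the text; the item labels are placeholders, and the two scales are treated as having no items in common, which is a simplifying assumption.

```python
def raw_score(answers, keyed_true, keyed_false):
    """Count answers that match the scoring key (answers: item -> bool)."""
    return (sum(1 for i in keyed_true if answers[i])
            + sum(1 for i in keyed_false if not answers[i]))

# Item labels are placeholders; only the key counts come from the text.
k_true = ["K1"]
k_false = [f"K{i}" for i in range(2, 31)]        # 29 of 30 keyed false
sc_true = [f"Sc{i}" for i in range(1, 60)]       # 59 of 78 keyed true
sc_false = [f"Sc{i}" for i in range(60, 79)]     # remaining 19 keyed false

all_items = k_true + k_false + sc_true + sc_false
yea_sayer = {item: True for item in all_items}   # answers "true" throughout
nay_sayer = {item: False for item in all_items}  # answers "false" throughout

for name, answers in (("yea-sayer", yea_sayer), ("nay-sayer", nay_sayer)):
    print(name,
          "K =", raw_score(answers, k_true, k_false),
          "Sc =", raw_score(answers, sc_true, sc_false))
```

The S who always answers true scores low on K and high on Sc, and the S who always answers false shows the reverse, so response style alone is enough to produce a negative correlation between the two keys.
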

In devising B, Fricke (1957) assumed that 
items of greatest “controversiality” (i.e., items 
yielding nearly equal numbers of true and 
false responses) are most susceptible to re- 
sponse bias. The B scale is composed of all 
MMPI items endorsed by 40-60% of Hath- 
away’s normal samples and not appearing on 
K. As Table 1 indicates, B is a rational meas- 
ure constructed entirely from response fre- 
quencies. 
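
Selection by "controversiality" amounts to a simple filter on endorsement rates. The sketch below is an illustration under stated assumptions: the item numbers and proportions are invented, the function name is hypothetical, and the band limits are treated as inclusive, which the text does not specify.

```python
def select_controversial_items(endorsement, excluded=frozenset(), low=0.40, high=0.60):
    """Return items whose 'true' rate lies in [low, high] and are not excluded."""
    return sorted(item for item, rate in endorsement.items()
                  if low <= rate <= high and item not in excluded)

# Hypothetical endorsement proportions; item numbers are placeholders.
endorsement = {101: 0.52, 102: 0.91, 103: 0.40, 104: 0.61, 105: 0.47, 106: 0.08}
k_items = {105}   # pretend item 105 appears on K
print(select_controversial_items(endorsement, excluded=k_items))
```

Widening the band to .36-.64, the range used for the Ex item pool, would admit more items from the same data. Nothing in the filter looks at item wording, which is why a key built this way may contain an excess of undesirable items.
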

Securing adequate measures of response 
bias is made difficult by questions as to the 
existence of several such biases (Jackson & 
Messick, 1958; Hanley, 1959). If these prob- 
lems are set aside, however, another difficulty 
arises. Should items on the MMPI express 
undesirable traits more often than desirable 
ones, the use of response frequencies alone in 
item selection places the psychologist at the 
mercy of the manner in which the authors of 
the inventory worded their items. A scale 
based entirely on response data may have an 
excess of items describing undesirable charac- 
teristics. If this occurs, response bias and de- 
fensiveness are confounded. An individual will 
tend to obtain a low score, for example, by 
giving socially desirable answers to items. 
Correlations between the response bias meas- 
ure and defensiveness scales then would be 
partly due to the role of social desirability. 
This has been suggested regarding correlations
between Fricke’s OAIS Set T scale
(1956) and MMPI measures (Hanley, 1957). 

The correlations between B and defensive- 
ness measures are inconclusive if it can be 
shown that B is affected by social desirability. 
Extensive as Wiggins’ data are, additional in- 
formation is needed to settle this question. In 
the original study of Ex (Hanley, 1957), it 
was recognized that item imbalance might 
lead to contamination by response bias. For 
this reason, a second version, Sx, containing 
equal numbers of true and false responses,
was described together with data showing
that it correlated significantly with several
MMPI diagnostic and validating scales. When
social desirability was ignored and all Sx
items keyed true to give a measure of response
bias (AT), correlations were obtained
with several MMPI keys in the predicted di- 
rection. Sx and AT scores, however, were not 
significantly correlated. From both sets of 
correlations, it was concluded that many 
MMPI measures were influenced by both 
response bias and defensiveness. 

B has correlations of .49 with AT and —.33 
with Sx, computed from the protocols of Han- 
ley’s 1957 sample. Both coefficients are sig- 
nificant at the 1% level. These results suggest 
that B is influenced by defensiveness as well 
as response bias. More direct evidence on this 
point, however, is obtained from judgments of 
the social desirability of the items comprising 
the B scale. 

Social desirability judgments were available 
for 25 B items from the earlier study (Hanley,
1957). The remaining 38 B items, together 
with three markers that help define the ex- 
tremes and middle of a nine-point social de- 
sirability rating scale, were rated by 26 male 
and 33 female Michigan State University stu- 
dents in two undergraduate child psychology 
sections. Social desirability of an item is defined
by its median rating. As in the earlier
study, items with values of 4 or less were
categorized as undesirable and those rated 6
or more desirable, while items with medians
between 4 and 6 were treated as neutral. 
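
The categorization rule can be sketched as follows. The ratings are invented, and the function name is hypothetical; the cutoffs of 4 and 6 on the nine-point scale follow the rule as stated in the text.

```python
from statistics import median

def classify_item(ratings, low=4, high=6):
    """Classify an item by the median of its social desirability ratings:
    median <= low is undesirable, median >= high is desirable,
    anything strictly between is neutral."""
    m = median(ratings)
    if m <= low:
        return "undesirable"
    if m >= high:
        return "desirable"
    return "neutral"

# Hypothetical ratings from five judges on a nine-point scale.
print(classify_item([2, 3, 3, 4, 5]))  # median 3
print(classify_item([5, 5, 4, 6, 5]))  # median 5
print(classify_item([7, 8, 6, 9, 7]))  # median 7
```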

Of 63 B items, 21 were judged undesirable,
32 neutral, and 10 desirable. The scale, it
appears, has an imbalance of socially undesirable
items.

Another way to demonstrate the imbalance
is to consider reliable variance. If the hypothesis
is correct, undesirable and desirable items
will be influenced by two sources of systematic
variance: social desirability and response
bias. Neutral items will be unaffected
by social desirability. Neutral items, therefore,
should contribute less variance to the B scale,
provided allowances are made for difference in
number of items involved.

TABLE 2

INTERNAL CONSISTENCY RELIABILITY

                Undesirable    Neutral    Desirable
Sex                Items        Items       Items

100 Men            .576         .513        .370
68 Women           .053 [?]     .397        .217

Reliability corrected to length of 32:
100 Men            .674         .513        [illegible]
68 Women           [illegible]  .397        [illegible]

To test this hypothesis, the B scale was 
broken into homogeneous subscales of unde- 
sirable, neutral, and desirable items. Kuder- 
Richardson Formula 20 reliabilities computed 
for the three subscales are shown in Table 2. 
The Ss were 100 males and 68 females, who 
in 1955 had taken the MMPI in introductory 
psychology classes at Michigan State University.
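
For readers unfamiliar with Kuder-Richardson Formula 20, a minimal sketch of the computation follows. The response matrix is a small invented example, not data from the study, and the use of the population variance of total scores is the classical convention assumed here.

```python
from statistics import pvariance

def kr20(responses):
    """Kuder-Richardson Formula 20 for a subjects-by-items matrix of
    0/1 scores (1 = keyed response):
        KR20 = k/(k-1) * (1 - sum(p_i * q_i) / variance of total scores)
    """
    n_subjects = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)  # population variance, the classical choice
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_subjects
        sum_pq += p * (1.0 - p)
    return (n_items / (n_items - 1)) * (1.0 - sum_pq / total_var)

# Hypothetical protocols: five subjects, four items.
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(demo), 3))
```

Because the coefficient depends on the number of items, subscales of 21, 32, and 10 items cannot be compared directly.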

Empirical reliabilities are given in the upper
half of Table 2. Since the subscales differ 
markedly in length, these values must be cor- 
rected to make comparisons meaningful. We 
ask, therefore, what reliabilities would be ex- 
pected if all subscales consisted of 32 items. 
The entries in the lower half of Table 2, 
computed by the generalized Spearman-Brown 
formula (Guilford, 1954, p. 354), answer this 
question. The desirable and undesirable items 
have greater internal consistency than the 
neutral ones, a result in agreement with the 
hypothesis that these two subscales contain 
variance associated with social desirability. 
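
The length correction can be checked with a short sketch of the generalized Spearman-Brown formula. The function name is hypothetical; the inputs are the item counts given in the text (21 undesirable, 32 neutral) and two Table 2 entries as they read in the source.

```python
def spearman_brown(reliability, old_length, new_length):
    """Generalized Spearman-Brown formula: predicted reliability when a
    scale of old_length items is changed to new_length comparable items."""
    n = new_length / old_length
    return n * reliability / (1.0 + (n - 1.0) * reliability)

# Men, undesirable subscale: 21 items with KR-20 of .576, corrected to 32 items.
print(round(spearman_brown(0.576, 21, 32), 3))

# A 32-item subscale is left unchanged by correction to 32 items.
print(round(spearman_brown(0.397, 32, 32), 3))
```

The first result agrees with the corrected value of .674 shown for men in Table 2, and the second shows why the neutral subscale has the same value in both halves of the table.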

Social desirability in B can be shown in yet 
another way. B has an internal consistency of 
.628 in these women and .647 in the men. By 
keying responses to the 10 desirable items 
false and scoring all others true, the role of 
social desirability is increased at the expense 
of response bias. The internal consistencies of
the revised measure are .673 for the women
and .640 for the men, results again demon- 
strating that B is affected by social desir- 
ability. 

The conclusion that rational validities of 
certain defensiveness measures should be con- 
sidered the result of response bias must be 
strongly qualified whenever it is based on 
correlations involving B. To devise a satis- 
factory measure of the hypothetical general 
response bias to inventory items, one should 
use judgment of content to eliminate an im- 
balance in socially desirable and socially un- 
desirable items. 


EMPIRICAL VALIDATION 


The third aim of the present study concerns 
the extent to which validation of the kind em- 
ployed by Wiggins risks incorporating vari- 
ance specific to the procedure. Such variance 
will be irrelevant to defensiveness as it occurs 
in diagnostic and screening situations in real 
life. A clue to one type of such specific vari- 
ance is given by the fact that the L scale is 
one of the more effective measures in Wiggins’ 
study. A likely source of such specific variance 
arises in items that are obviously measures of 
defensiveness and whose keying as such is 
transparent. An “obvious” item, to borrow 
Wiener’s (1948) term, is one a sophisticated 
individual will recognize as a trap for defen- 
siveness.1 The L scale is thought to suffer from 
such obviousness: “At least, one may conclude 
that the intent to deceive is not often detect- 
able by L when the subjects are relatively 
normal and sophisticated” (Meehl & Hath- 
away, 1946, p. 538). Obviousness, however, 
is undesirable only if keying of responses is 
transparent. When defensive Ss identify an 
item as pertaining to defensiveness, but think 
that the nonkeyed response is the critical one, 
the item remains effective. There may exist, 
therefore, obvious items worthless in practical 
use because their scoring is transparent, obvious
items useful because their scoring is disguised,
and items so subtle that scoring is no
issue.

1 An item may be “obvious” for some purposes but
“subtle” for others. In Wiener’s study, for example,
the item “I am happy most of the time” is considered
a subtle measure of Pa but an obvious measure of
Hy. Obvious defensiveness items, in the same way,
may be subtle on other scales.

TABLE 3
PROPORTIONS OF ITEMS RECEIVING DIFFERENT
NUMBERS OF “OBVIOUS” JUDGMENTS
FROM 48 JUDGES

                Number of “Obvious” Judgments Received

Scale           36-23     21-16     15-9      8-0
L                .40       .33       .20      .07
Sd               .30       .22       .20      .28
Cof              .26       .29       .18      .26
Ex               .19       .42       .27      .12
K                .13       .30       .30      .27
All Items        .25       .24       .26      .24

In validating procedures of the instructed 
vs. standard groups variety, obvious-transpar- 
ent items are as effective as any others in dis- 
tinguishing between instructed and control Ss. 
Ss asked to give the socially desirable answer 
or to fake a role should do so with obvious as 
well as subtle items. Controls taking the in- 
ventory under standard instructions should 
avoid defensive answers to obvious-transpar- 
ent items. A scale derived from comparison of 
the two groups ought to be effective in similar 
investigations, but many of its items may 
prove useless in real life applications. For this 
reason, performance in Wiggins’ study alone 
is an unsatisfactory standard against which to 
judge various methods of constructing meas- 
ures of test taking defensiveness. 

To determine proportions of obvious items 
in the empirical and rational scales, 18 male 
and 30 female students in the sections used 
3 weeks earlier for judgments of the B scale 
each selected the 30 to 40 items most obvi- 
ously measuring defensiveness from a list of 
the 103 items on K, L, Ex, Cof, and Sd.
Next, they gave the defensive answer to every 
item they had chosen. From their choices 
come two kinds of information: obviousness of 
each item and transparency of its scoring. 

Obviousness of Items. Data on obviousness 
appear in Table 3. The 103 items, several of 
which occur on more than one measure, are 
grouped very nearly into quartiles according 
to the number of obvious judgments received 
from the 48 judges. The L scale clearly is
composed of a relatively large number of obvious
items. K, on the other hand, is least
obvious, a finding that supports the authors 
of the MMPI in their belief that K is the 
subtler measure. 

The other three measures fall between the 
two extremes. The empirical Cof and Sd scales 
have a greater proportion of extremely obvious
items than the rational Ex measure. If
the first two columns of Table 3 are pooled,
however, the advantage for Ex disappears. At
this point, data on judges’ agreement on the
defensive answer are relevant. Of the 52 items
receiving 16 or more judgments of obvious,
answers to 8 were confused to the extent that
one-fourth or more of the judges disagreed
with the majority. One Cof and four Sd items
were subject to such extensive disagreement,
but the majority of judges in each case chose
the keyed response. One K item was so affected,
but the majority answer was not the keyed one.
Of three Ex items disagreed on, only one was
answered in the keyed direction by the majority.
K and Ex, it seems, are even less affected
by item obviousness than Table 3 indicates.

Transparency of Scoring. Most items judged
obvious were answered by the average judge
as keyed on individual scales; nevertheless,
some were answered in the nonkeyed direction
(i.e., judges thought the “honest” answer was
the defensive one). For a systematic study of
this behavior, responses to all items receiving 
nine or more judgments of obvious were an- 
alyzed—those with fewer are so subtle that 
transparency is no issue. 

The results of this analysis may be expressed
by a ratio of incorrect to correct average
identifications of the keyed response. With
K, for example, the ratio is 7/15: 7 incorrect
and 15 correct identifications out of 22 items
receiving 9 or more obvious judgments. Ratios
for the other scales are: Ex 5/18, Cof 1/24,
Sd 3/26, and L 0/15. These data demonstrate
that the most effective scales in Wiggins’ study
tend to be more transparent in scoring than
is the case with those he found less useful.
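
The ratios are easier to compare when converted to the share of obvious items whose keyed response was misidentified. The sketch below does only that conversion; the counts are those given in the text, and the function and variable names are hypothetical.

```python
# (incorrect, correct) average identifications of the keyed response,
# for items receiving 9 or more "obvious" judgments.
ratios = {"K": (7, 15), "Ex": (5, 18), "Cof": (1, 24), "Sd": (3, 26), "L": (0, 15)}

def misidentified_share(incorrect, correct):
    """Proportion of obvious items whose keyed response was misidentified."""
    return incorrect / (incorrect + correct)

shares = {scale: misidentified_share(*pair) for scale, pair in ratios.items()}

# Order scales from most transparent (smallest share) to least transparent.
order = sorted(shares, key=shares.get)
for scale in order:
    print(f"{scale:>3}: {shares[scale]:.3f}")
```

On this index L, Cof, and Sd are the most transparent scales and K the least, with Ex between them.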

Subtle Items. While the empirical method
used by Wiggins and by Cofer et al. is prone
to include items undesirable in a measure of
defensiveness, the data in Table 3 indicate
that it uncovers a fair number of subtle items.
Prominent among those falling into the lowest
quartile of obviousness are items whose content
includes the words “I like.” Some 15 of
the 103 items contain this expression, and 10
of these are among the subtlest. An example
is the item: “I would like to be a soldier.” 
Five subtle “like” items appear on Cof, six on 
Sd, one on K, and none on L and Ex. Lorge 
(1937) discovered response bias on the like 
items on the Strong VIB, and Hanley (1959) 
has presented evidence of a set specific to such 
items rather than to items in general. A de- 
fensiveness measure with many like items can 
be affected by individual differences in the 
specific response set. For this reason, subtle 
like items may be undesirable in a measure 
of defensiveness.

Despite proneness to a specific set, like
items may prove useful in the screening and
diagnostic situations. It is possible that Ss 
attempting to portray themselves in an overly 
favorable manner tend to like almost every- 
thing. If this is true, there is no objection to 
the use of such items, save for the reservation 
expressed above. 


DISCUSSION 


The results of Wiggins’ study show high 
empirical validity for the Sd and Cof meas- 
ures. Rather than presenting his findings as 
only a validation and cross-validation of these 
particular scales, Wiggins has taken the more 
constructive path of raising general methodo- 
logical questions that relate to all measures of 
social desirability. The danger arises that the 
success of his scale may lead to an uncritical 
acceptance of the methodological analysis in 
which he employs the concepts of empirical 
validation and response bias to account for 
different efficiencies and correlations in his 
comprehensive sample of defensiveness meas- 
ures. By use of his own extensive data, how- 
ever, it has been possible to show that method 
of validation is less systematically related to 
effectiveness than is selection by attention to 
item content and response frequencies. The 
relatively ineffective measures in his study, 
the empirical K and the rational SD scales, 
lacked one of these two selection criteria in 
their construction.

Wiggins properly points to the possible contamination
of rational scales by response bias,
but an empirical method has produced one
scale, K, that is probably so affected. New 
data on defensiveness in the B scale, which he 
used to measure response bias, indicate that 
it is affected by social desirability. While con- 
struction of rational measures certainly should 
aim at balancing items to eliminate response- 
bias contamination (Hanley, 1957), the em- 
pirical methods employed by Wiggins and by 
Cofer et al. appear from data presented in this 
paper to suffer the limitation of including 
items that may be too obvious to be useful in 
real life measurement of test taking defen- 
siveness. 

The data that indicate excellent discrimina- 
tion for the empirical Sd and Cof measures 
show good discrimination for the rational L 
and Ex scales. The L scale is short, and the Ex 
measure was originally presented as a meth- 
odological demonstration rather than as a 
practical instrument (Hanley, 1957). In view 
of the success of these four scales, it should 
be emphasized that both empirical and ra- 
tional methods can work in the contrasted 
groups’ approximation to the real life situation. 
(Correlations among these measures in Wig- 
gins’ sample of control men demonstrate that 
Sd and Cof do not form a pair clearly distin- 
guished from the other measures. Cof and Ex, 
for example, correlate more highly than Cof 
and Sd, despite a 14-item overlap in the latter 
two scales.) 

Validation by contrasted groups, however, 
remains only an approximation to screening 
and diagnostic performance. For this reason, 
Wiggins’ results do not foreclose the possibil- 
ity that a seemingly ineffective scale may be 
useful in actual practice. For any defensive- 
ness measure to aid in screening and diagnosis, 
moreover, one condition must be met: if a 
linear regression model is used, the defensive- 
ness scale must correlate with the diagnostic 
measure, that is, a scale cannot suppress irrelevant
variance in a predictor unless it correlates
with the predictor. Rational scales seem
more likely to meet this requirement than do empirical
measures, save for K. The correlation
of —.091 between Sd and Sc reported by 
Wiggins, for example, means that Sd cannot 
suppress variance in Sc that is associated with 
defensiveness. While it is possible to hold that 
the role of defensiveness in responses to inventories
is seriously exaggerated—that the
low correlation is the outcome of honest an- 
swers to Sc items—so many psychologists have 
assumed defensiveness operates to reduce the 
effectiveness of inventories that this plausible 
but radical hypothesis needs verification in 
studies of protocols obtained from patients 
and controls in screening and diagnostic set- 
tings. Until then, no method can rightly be 
termed “the method of choice.” 
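
The requirement that a suppressor correlate with the predictor can be illustrated with the standard two-predictor formula for the squared multiple correlation. The sketch assumes a classical suppressor, one that is uncorrelated with the criterion, and a hypothetical predictor validity of .40; neither assumption comes from the source. The two correlations with Sc (-.091 for Sd and -.638 for K) are the values quoted from Wiggins in the text.

```python
def r_squared_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """Squared multiple correlation of a criterion y with two predictors,
    computed from the three zero-order correlations."""
    return (r_y1**2 + r_y2**2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12**2)

def suppression_gain(r_12: float, r_y1: float = 0.40) -> float:
    """Ratio of R^2 (predictor plus suppressor) to r^2 (predictor alone),
    assuming the suppressor is uncorrelated with the criterion (r_y2 = 0).
    The default validity r_y1 = .40 is hypothetical."""
    return r_squared_two_predictors(r_y1, 0.0, r_12) / r_y1**2

for name, r in (("Sd", -0.091), ("K", -0.638)):
    print(f"{name} with Sc: r = {r:+.3f}, gain in R^2 = {suppression_gain(r):.3f}")
```

Under these assumptions the gain depends only on the correlation between suppressor and predictor: a correlation of -.091 leaves the predictable variance almost unchanged, while one of -.638 raises it appreciably.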

The derivation of useful suppressor scales is 
not the only concern in studies of social de- 
sirability; there ought to be some explanation 
of failure and success. The methodological 
considerations raised by Wiggins and by the 
present paper are important for this reason. 
Wiggins indicates what he believes is the im- 
portant dimension; the present paper presents 
alternatives that fit his results. The final reso- 
lution of these differences, however, rests on 
studies as extensive as that of Wiggins but 
conducted with actual rather than simulated 
protocols. The methodological analyses indicate
dimensions that should be explored in
such a study. 


SUMMARY 


MMPI scales related to social desirability 
differ in use of response frequencies and atten- 
tion to item content in selection of individual 
items. A reinterpretation of data from an 
extensive study by Wiggins (1959) indicates 
that scales using both response frequencies 
and judged item content in their construction 
are superior at discriminating controls from 
subjects instructed to respond to the MMPI 
in a socially desirable manner. Whether scales 
were originally validated by the “empirical” 
or the “rational” method is less systematically 
related to their effectiveness. 

The role of response bias in producing cor- 
relations between rational scales and MMPI 
diagnostic measures is unclear because of the
difficulty of obtaining a pure measure of response
bias. The B scale employed by Wiggins
to measure response bias is also influenced by 
social desirability. 

Derivation by contrasted groups, the em- 
pirical method used by Wiggins, suffers from 
the fact that it includes many items that are 
obvious measures of defensiveness and whose 
scoring is transparent. 

Preference for empirical or for rational pro- 
cedures should await studies of their effective- 
ness in real life diagnostic and screening 
situations.


AN ANALYSIS OF FIGURE ROTATIONS


H. BIRNET HOVEY 1


Veterans Administration Hospital, Salt Lake City, Utah 


When a patient produces a rotation or reversal
for a stimulus figure during a psychological
test, this is interpreted as a sign of
brain damage. For instance, higher weighted
scores for brain damage are assigned for rotations
than for any other kinds of errors in the
Graham-Kendall (1946) Memory-for-Designs
Test (G-K). During routine administration of
that test at this hospital, it has been noted
that an occasional intelligent patient produces
a rotation without making any other [...]
error. A correctly reproduced figure, although
rotated or reversed, seems on inspection to
indicate better cerebration than a reproduction
in which the gestalt is changed or other
kinds of errors are made [...].
So, since brain damaged persons produce rotations
to a significantly greater extent than
others, it was hypothesized that such rotations
might have significance also in [...].
Could it represent an element of [...], a
hysterical maneuver, a form of [...] to
overdemonstrate brain injury, [...]
lack of interest, distraction, or [...] physiological
disturbances in the brain? [...]
of a few handy cases with rotation suggested
the last as a likely possibility.


PROCEDURE 


Of the patients who had had electroencephalograms
(EEGs), all the G-K protocols at this hospital
containing rotations or reversals were collected to
find out if a patterning of some kind might emerge
when these protocols were compared as a group with
other data in the clinical files. Almost at once the
observation was made that there may be a loading
for epilepsy. There were 42 protocols with rotations
or reversals among a total of 338 protocols for patients
who had also had EEGs.

1 Kenneth A. Kooi, at the University of Michigan
Medical School, furnished to the author [...]
EEG data which were used in this study.
[...], research psychologist at the Veterans
Administration Hospital in Salt Lake City, rechecked
[...] based on rotations made by the original
examiner and gave useful suggestions. Leonard W.
Jarcho, [...] of the Neurology Service of this hospital,
and Chairman of the Division of Neurology of
the University of Utah College of Medicine, critically
[...] the results and interpretations and made
helpful suggestions for the manuscript.

Between 25-50%, varying from month to month, 
of all patients admitted to this hospital since it 
opened in 1952 have received EEGs. Cases have been 
referred primarily for EEGs when shock therapy was 
contemplated, when epilepsy or brain damage was 
considered a possibility or was known, when patients 
were regarded as alcoholics, as elderly psychotics, as 
special diagnostic problems, etc. When a patient was 
referred for both an EEG and a psychological evalua- 
tion, both referrals were usually made at about the 
same time. 

All the protocols were separated into two major 
groupings based on EEG summary impressions of the 
electroencephalographers. The criterion for the first 
grouping was “normal” or “within normal limits.” 
This contained 129 cases and was set aside. The 
second major grouping, containing 209 cases, was 
broken down into two groups for the analysis. All 
had “abnormal” or “borderline abnormal” records 
such as “generalized slow patterns,” or “slow alpha,” 
etc. When, in addition, mention was made of the
presence of transient episodic disruptions of the ab- 
normal background or usual patterns, the case was 
placed in Group I. Notations for these cases were 
such as “paroxysmal formations,” or “scattered sharp 
formations,” or “spikes,” or “occasional delta,” etc. 
This group contained 82 cases. Group II was composed
of the 127 cases with abnormal EEGs but
without any notations of transient disturbances. 


RESULTS 


When Group I was compared with Group II 
for frequency of rotations, the χ² was 19.67,
which is beyond the .001 level of confidence. 
The r_t was .57. When Group I was compared
with the normal group or a combination of 
this and Group II, the relationship was higher, 
as there were only three cases in the normal
grouping with rotation. Group I contained 28
such cases.

When the cases were rearranged according
to diagnoses and without regard to EEG find- 
ings, relationships turned out to be appre- 
ciably weaker. A group of 77 patients with the 
diagnosis of epilepsy was compared with a 
group of 136 with the diagnosis of brain 
damage, for incidence of rotation. The χ² fell
down to 6.47, which reaches only the .02 
level of confidence, and the r_t was only .33.
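
The Group I versus Group II comparison can be reconstructed approximately from the counts in the text. Two things in the sketch below are assumptions, not statements from the source: the number of Group II rotation protocols is inferred as 42 - 28 - 3 = 11, and the continuity (Yates) correction is applied because it brings the result close to the reported value. The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square for a 2 x 2 table [[a, b], [c, d]], optionally with
    the Yates continuity correction."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2.0, 0.0)
    return n * diff**2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Rows: Group I (82 cases), Group II (127 cases).
# Columns: protocols with rotation, protocols without rotation.
group_1 = (28, 82 - 28)
group_2 = (11, 127 - 11)  # 11 is inferred, not reported in the text

chi2_corrected = chi_square_2x2(*group_1, *group_2, yates=True)
chi2_uncorrected = chi_square_2x2(*group_1, *group_2, yates=False)
print(f"with correction:    {chi2_corrected:.3f}")
print(f"without correction: {chi2_uncorrected:.3f}")
```

With the correction the statistic comes out within a few thousandths of the reported 19.67, which suggests that the inferred count and the corrected formula are at least consistent with the published figure.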


DISCUSSION


One explanation for the drop in relation- 
ships when comparisons were made according 
to diagnoses, is that often an “organic” has 
seizures of some kind without any mention of 
this fact in the diagnosis, perhaps because the 
seizures are regarded by a diagnostician as a 
minor symptom. Another is that seizures may 
not have been observed by trained profes- 
sional personnel. Still another is that a patient 
may never have had a recognized seizure. At 
any rate, the data demonstrate that rotations 
or reversals of G-K figures are associated with 
transient, episodic EEG disturbances, and 
they suggest that rotations may be a sign of 
epilepsy or a potential epileptic condition, 
possibly a subclinical manifestation. 

An interpretation of rotation could be as 
follows: the subject correctly perceives the 
stimulus figure, but by the time he starts to 
draw the reproduction, some kind of transient 
physiological dysfunction in the brain has oc- 
curred, altering his memory of it. There were 
five rotation cases among the whole series, 
each with a total error score of 3 on the G-K, 
these scores being based on one rotation and 
with no other scorable errors. This would suggest
that each person correctly perceived all
the figures, but had a crucial interference of 
vigilance while negotiating one of them. 

Frequently an epileptic patient, while doing 
the G-K test, might announce that he had 
forgotten the design. Alternatively, he might 
reproduce it incorrectly and then do it again 
correctly without prompting. So far tabulations
for these behaviors have not been made
here. When confronted with an incorrect re- 
production after the test was over, an epileptic 
subject frequently recognized the error and 
explained it in terms of momentary confusion,
or would say that he had been distracted by
thinking of something else, or was not paying 
enough attention. Prior to the confrontation, 
most subjects when asked could correctly 
point out reproductions which were inaccu- 
rate, suggesting an awareness of transient con- 
fusion. Now and then a subject might make a 
rotation or a marked error in an otherwise 
accurate record, and when asked after the
test to find the error, would not only locate it
but reproduce it correctly without reviewing
the stimulus card. 

Somewhat comparable phenomena in exam- 
inations with the Wechsler Adult Intelligence 
Scale have already been reported (Hovey & 
Kooi, 1955; Kooi & Hovey, 1957). MMPIs 
administered to most of the same subjects in
those studies also tended to produce characteristic
profiles for epileptics (Hovey, Kooi, &
Thomas, 1959).

The majority of the diagnosed epileptic
group and also the majority of the group with
episodic EEG features had known brain damage.
Only 2 of the 42 rotation cases had
neither an abnormal EEG nor a diagnosis
implying organicity.

Chorost, Spivack, & Levine (1960) report
that rotation of Bender-Gestalt figures by
children was slightly associated with EEG
abnormality but not enough for predictive
purposes. The difference between my results
and theirs could be explained by the fact
that children generally have less ability than
do adults to execute such drawings
(Pascal, 1951, pp. 23, 42). Therefore [...]
groups of children might be expected to have
a relatively high incidence of rotation. The
adult groups used in the present study contained
much smaller proportions of rotations
than did theirs, approaching the vanishing
point for subjects with normal EEGs. Furthermore,
the present project used 45° instead of
their 30° criterion for rotation. However, direct
comparisons between the two studies cannot
be made since standard administration of
the Bender figures permits continuous reference
to the stimulus figures, whereas a memory
factor is involved in the G-K administration.
The current observations are consistent with
the prevailing opinion that rotation is associated
with organic disease of the adult brain.


SUMMARY


The performance on a design reproduction 
test of a group of patients with transient 
episodic EEG features was compared with a 
group having abnormal EEGs but without 
observed episodic features. The episodic group 
made rotations to a significantly greater extent.


ROLE PLAYING IN ACUTE AND CHRONIC
SCHIZOPHRENIA 1


BERNARD L. BLOOM

Hawaii State Hospital

AND

ABE ARKOFF

University of Hawaii


Recent research has suggested that role
playing or empathic ability is related to general
adjustment. Working with college populations,
a number of investigators (Dymond,
1948, 1949, 1950; Lindgren & Robinson,
1953; McClelland, 1951; Norman & Ainsworth,
1954) have found that better adjusted
students play roles and empathize with greater
facility than those who are less well adjusted.
By logical extension it might be assumed that
“normals” generally are more skilled in this
function than neurotics and psychotics. Some
very recent studies, however, have shown that
certain schizophrenic groups have considerable
role playing skill.

Jackson and Carr (1955) reported, for example,
that their normal controls demonstrated
greater empathic ability than schizophrenic
patients; their schizophrenic sample,
however, was quite heterogeneous, some patients
consistently demonstrating more empathic
ability than a number of controls.
When Helfand (1956) compared the empathic
ability of four groups (normals, tuberculous
nonpsychotic patients, and chronic and privileged
schizophrenics), privileged schizophrenics
proved to be superior to all others including
normals. Some related information was
produced by Grayson and Olinger (1957),
who found that when asked to simulate “normalcy,”
most of their psychiatric patients
(largely schizophrenics) were able to improve
their test performance and that improvement
was related to early discharge from the hospital.

The present study attempted to throw further
light on role playing in schizophrenia. On
the basis of previous research, the following
hypotheses were formulated:

1. Acutely ill schizophrenics are better able
to play the normal role than chronically ill
schizophrenics.

2. Whether acutely or chronically ill, schizophrenics
who subsequently improve are better
able to play the normal role than those
who do not.

1 This investigation was supported by a research
grant (M-1529) from the National Institute of Mental
Health, National Institutes of Health, United
States Public Health Service.

Presented, in part, at the meetings of the Western
Psychological Association, San Diego, April 17, 1959.

METHOD 
Subjects 


The subjects (Ss) of the study were 54 hospitalized
women diagnosed as either acute or chronic schizophrenics.
Each of these two groups was further
divided into fast and slow improvement subgroups.
Half of the total sample was Caucasian. The remainder
was Oriental or part-Hawaiian with the
majority being Japanese.

The acutely ill group was made up of 25 patients,
none of whom had a history of prior hospitalization
beyond that associated with usual commitment procedures.
In addition, suddenness of onset in a previously
compensated personality structure was
used as a criterion which determined inclusion in this
group. Within the acute group, 12 patients were considered
to be in the fast improvement subgroup and
13 in the slow improvement one. This classification
was based on an evaluation of the hospital [...]
over the 6 months following the testing of the [...]
case. All of the patients of the fast improvement subgroup
had been discharged as improved or [...].
Their range of hospitalization was from 1 to [...]
months with an average of 2.6 months. In the slow
improvement subgroup, seven patients had been discharged;
six remained in the hospital. Length of
hospital stay for these Ss ranged from 6 to 20 months
with an average of 10.9 months. In general, [...]
in the slow improvement subgroup was not only less
rapid, but it was also qualitatively less striking.

The chronic group was made up of 29 patients who
had either been continuously hospitalized for 2 or
more years or had been admitted at least twice and
had a history of long-standing schizophrenic adjustment.
Within this group, 13 patients were placed in
the fast improvement subgroup and 16 in the slow 
one. All of the patients of the fast improvement sub- 
group had been discharged as improved. Their length 
of hospitalization (including all former periods) 
ranged from § to 65 months with an average of 29.2 
months. In the slow subgroup, no patients had been 
discharged or seemed ready for even preliminary con- 
sideration for discharge. Length of hospital stay for 
these Ss ran from 12 to 180 months with an average 
of 56.3 months. 

Prior to inclusion in the study, patients under con- 
sideration were administered the Vocabulary subtest 
of the Wechsler-Bellevue, Form I, and only testable 
patients with a weighted score of seven or higher 
were used. None of the groups or subgroups differed 
significantly from others on vocabulary score or 
amount of formal education. The average number of 
school grades completed was 10.5. The chronic groups 
averaged 34 years old; the acute groups’ average age 
was 30.5.


Procedure 


In studies of role playing and empathy, the S is 
usually asked to predict the response of another 
person who is in some way known to the S. This 
procedure has been criticized by a number of investi- 
gators. Hastorf and Bender (1952) emphasized that 
projection rather than empathy may account for part 
of the prediction of another person's responses. Lind- 
gren and Robinson (1953) pointed out that instead 
of truly empathizing, S may respond in terms of a 
stereotype; and Helfand (1956) indeed found that 
his normals tended to rely on a conventional frame of 
reference although this was not true for his schizo-
phrenic groups, who were deficient in such a ref-
erence.

Some investigators have explicitly undertaken to
assess their Ss’ awareness of normative data rather
than their awareness of a specified criterion individ-
ual. Indeed, Crow (1959) suggested that when judges
are asked to predict personality characteristics of 
criterion Ss, their judgments are more accurate if they 
are based upon stereotypes than if they are based on 
specific information about each criterion S. In Crow's 
study, a variety of judges (student nurses, medical 
students, psychiatric residents, and others) were 
asked to estimate the age, intelligence, vocabulary 
level, and personality characteristics of 10 medical 
patients, based upon their seeing a 6-minute sound 
movie of each patient being interviewed by a physi- 
cian. In addition, the judges were asked to make 
estimations for the “average patient.” On the basis of 
these two kinds of judgments, it was possible to com- 
pute a stereotype accuracy score (subtracting a 
judge's estimation for the average patient from each 
of the criterion scores) and an individual accuracy 
score (subtracting a judge’s estimation for each pa- 
tient from that patient’s criterion score). Crow found 
that stereotype accuracy was clearly more accurate 
for estimation of personality characteristics; that is, 
the judges would have been more accurate if they 
had given their estimation for the average patient
each time instead of making an individual prediction
for each patient. 
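
The two accuracy scores described above can be illustrated with a short sketch. It is not Crow's actual computation: the text says only that estimates were subtracted from criterion scores, so the use of mean absolute differences, the function name `accuracy_errors`, and all of the numbers below are assumptions made for illustration.

```python
from statistics import mean

def accuracy_errors(criterion, individual_estimates, average_patient_estimate):
    """Return (stereotype_error, individual_error) for one judge.

    Both are mean absolute differences from the criterion scores
    (an assumed summary; the source only says "subtracting").
    Smaller values mean more accurate judgments.
    """
    # Stereotype accuracy: the single "average patient" estimate is
    # compared with every patient's criterion score.
    stereotype = mean(abs(c - average_patient_estimate) for c in criterion)
    # Individual accuracy: each patient-specific estimate is compared
    # with that same patient's criterion score.
    individual = mean(abs(c - e) for c, e in zip(criterion, individual_estimates))
    return stereotype, individual

# Hypothetical criterion scores for 10 patients and one judge's estimates.
criterion = [12, 14, 11, 13, 15, 12, 14, 13, 11, 15]
individual_estimates = [9, 17, 14, 10, 18, 9, 17, 10, 14, 12]
average_patient_estimate = 13

stereotype_error, individual_error = accuracy_errors(
    criterion, individual_estimates, average_patient_estimate
)
```

With these made-up figures the judge's error is smaller when the "average patient" estimate is applied to everyone than when individual predictions are used, which is the pattern Crow reported for personality characteristics.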

The procedure used in the present investigation was 
similar to Crow's and to the one used by Grayson 
and Olinger (1957) in that awareness of normative 
data was measured, that is, “abnormal” Ss were asked 
to play or simulate the “normal” role. Each S was 
tested in two sessions, the first session in the morning 
and the second in the afternoon of the same day. In 
the case of the acutely ill group, the testing was ac- 
complished within 1 week after hospitalization. In the 
first session, the Rorschach and the Sc scale of the
Minnesota Multiphasic Personality Inventory were 
given under standard instructions. In the second ses- 
sion, these two tests were administered again with 
special role playing instructions to the S to respond 
in the way that a “typical, average, ordinary” per- 
son would. The instructions were repeated with each 
Rorschach card and wherever it seemed indicated on 
the MMPI Sc. The word normal itself was not used 
because preliminary investigation revealed that this 
term provoked negative reactions on the part of 
some patients. 

Each Rorschach protocol was scored for the “prin- 
cipal indicators of schizophrenic disorganization” de- 
scribed by Schafer (1948). These indicators include 
low form level, use of pure color, sex responses, sud- 
den changes, irregular sequence of locations, and vari- 
ous types of deviant verbalizations. Every indicator 
was given a score of one point each time it appeared 
except that F+% between 50 and 59 was given a 
score of 1, and under 50 a score of 2. For each proto- 
col, the total number of indicators was divided by 
the number of responses to yield a schizophrenic 
disorganization quotient which took into account 
the productivity of the S. Statistical analysis of the 
Rorschach results was based on this quotient. The 
MMPI Sc was scored in the usual manner.
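
The scoring rule for the disorganization quotient can be sketched as follows. This is an illustration under stated assumptions rather than the authors' scoring procedure: the function names are hypothetical, the count of indicators other than form level is taken as already tallied, and F+% values from 59 up to 60 are treated as falling in the 1-point band.

```python
def form_level_points(f_plus_percent):
    """Points for low form level: 2 if F+% is under 50, 1 if it is 50-59, else 0."""
    if f_plus_percent < 50:
        return 2
    if f_plus_percent < 60:  # assumed upper edge of the "50 to 59" band
        return 1
    return 0

def disorganization_quotient(other_indicator_count, f_plus_percent, n_responses):
    """Total indicator points divided by the number of Rorschach responses."""
    if n_responses <= 0:
        raise ValueError("a protocol needs at least one response")
    total = other_indicator_count + form_level_points(f_plus_percent)
    return total / n_responses

# Hypothetical protocol: 7 other indicators, F+% of 55, 20 responses.
example_quotient = disorganization_quotient(7, 55, 20)  # (7 + 1) / 20 = 0.4
```

Dividing by the number of responses keeps a productive S from appearing more disorganized merely because a longer protocol offers more opportunities for indicators to occur.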


RESULTS 


The Rorschach schizophrenic disorganiza- 
tion indices and the MMPI Sc scores for the 
various groups under the two experimental 
conditions are presented in Table 1. On both 
the Rorschach and MMPI Sc, high scores 
were regarded as evidence of schizophrenia 
and reduced scores under role playing in- 
structions were considered to be evidence of 
ability to play the normal role. 

On the Rorschach, all experimental groups 
showed some reduction in signs of schizo-
phrenic disorganization under role taking
conditions. The acute group significantly re-
duced its quotients, demonstrating a de-
creased schizophrenic disorganization when
playing the normal role. The fast improve-
ment group similarly reduced its quotients.
However, the degree of reduction of the signs
TABLE 1

INDICES OF SCHIZOPHRENIC DISORGANIZATION ON THE RORSCHACH TEST AND MMPI Sc SCALE
UNDER STANDARD AND ROLE PLAYING INSTRUCTIONS


                            Rorschach Test
                  Disorganization    Disorganization
                    Indicators           Quotient                   MMPI Sc Scale
                   Mean     SD       Mean    SD      r       t       Mean     SD
Acutely Ill (N = 25)
  Standard         11.36   14.00     .76    .90                      32.96   17.52
  Role Playing      5.16    5.48     .32    .36     .42    2.75**    30.72   18.40
Chronically Ill (N = 29)
  Standard          8.93    9.17     .76    .82                      22.28   11.90
  Role Playing      7.34    8.28     .70    .78     .85     .67      25.07   15.48
Fast Improvement (N = 25)
  Standard          8.08    7.36     .67    .66                      27.48   16.64
  Role Playing      4.00    5.04     .33    .33  [illeg.]  2.43*   [?]4.44   16.28
Slow Improvement (N = 29)
  Standard         11.76   14.21     .84    .99                      27.00   14.83
  Role Playing      8.34    8.14     .69    .80     .76    1.25      30.48   17.34

Note. The r and t entries for the MMPI Sc scale are not legible in this copy.
 * Significant at .02 level.
** Significant at .01 level.


of schizophrenic disorganization in the chronic
group and in the slow improvement group
was slight and did not achieve significance,


scores on the sam 
conditions, 


The statistical tests of the two experimen-
tal hypotheses are presented in Table 2. All


TABLE 2

ANALYSIS OF VARIANCE OF DIFFERENCES BETWEEN
TEST PERFORMANCE UNDER STANDARD AND ROLE
PLAYING INSTRUCTIONS

                                   Rorschach               MMPI
Source of Variance         df     MS        F          MS           F
Chronicity of Illness       1     2.04      4.57*      349.09       1.07
Rapidity of Improvement     1   [illeg.]  [illeg.]     592.24       1.82
Interaction                 1     0.26      0.59       180.08       0.55
Within Groups              50    22.33                 16,253.83

* p < .05.


of the results are in the predicted direction,
but only one achieved statistical significance.
The acute group improved its Rorschach per-
formance significantly more in playing the 
normal role than did the chronic group. 
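
The F ratios in Table 2 can be checked with a little arithmetic. The sketch below rests on an assumption that the table does not state: that the Within Groups entries (22.33 and 16,253.83) are sums of squares on 50 degrees of freedom, so that the error mean square is that entry divided by 50. Under that reading the printed F values are reproduced to within rounding.

```python
WITHIN_DF = 50  # degrees of freedom for the Within Groups term

def f_ratio(ms_effect, ss_within, df_within=WITHIN_DF):
    """F = effect mean square / (within-groups sum of squares / df)."""
    return ms_effect / (ss_within / df_within)

# Mean squares as printed in Table 2 (the Rorschach entry for
# Rapidity of Improvement is illegible and is therefore omitted).
rorschach_ms = {"chronicity": 2.04, "interaction": 0.26}
mmpi_ms = {"chronicity": 349.09, "rapidity": 592.24, "interaction": 180.08}

rorschach_f = {k: round(f_ratio(v, 22.33), 2) for k, v in rorschach_ms.items()}
mmpi_f = {k: round(f_ratio(v, 16253.83), 2) for k, v in mmpi_ms.items()}
```

The recomputed values are 4.57 and 0.58 for the Rorschach (printed: 4.57 and 0.59) and 1.07, 1.82, and 0.55 for the MMPI, which agrees with the table and with the statement that only the chronicity effect on the Rorschach reached significance.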


DISCUSSION 


Studies of changes in Rorschach protocols 
between two test administrations in the ab- 
sence of instructions to play a specified role 
suggest that signs of psychopathology visible
on the first protocol are equally clear on the
second. Griffith (1951) tested a sample of
four patients with a diagnosis of Korsakoff
syndrome whose memory was sufficiently im-
paired so that they did not recall the first test
situation whe 


day afterwards 


and that this 
trations. Significant statistical changes could
not be demonstrated for any Rorschach fac- 
tor from test to retest. They suggested that 
chronic schizophrenics are highly consistent. 
These studies indicate that one can reason- 
ably expect reliable performance from test to 
retest even within a psychotic population in 
the absence of special role taking instructions. 
The description of the schizophrenic as a 
person lacking a concept of a “generalized 
other,” offered by Sarbin (1943) and elabo- 
rated by Helfand (1956), is consistent with 
the performance of the chronic schizophrenics 
of the present study. This description is less 
appropriate, however, for the acute schizo- 
phrenics since the Rorschach results suggest 
that this group has some conception of the 
normal role and can differentiate it from 
schizophrenia. This is particularly true of 
those acute schizophrenics who subsequently 
showed rapid clinical improvement. 
Comparison of the results on the Rorschach 
and MMPI indicates that the Ss of the study 
were able to reduce their schizophrenia scores 
on the former but not on the latter. It might 
be assumed that a well-structured task such 
as the MMPI Sc scale would prove more re- 
sponsive to role playing than a less structured 
one such as the Rorschach. Scores obtained 
by the experimental group were in the range 
typically found by other investigators when 
studying schizophrenics, which suggests a cer- 
tain degree of validity in the present findings 
insofar as the standard instruction MMPI is 
concerned. Sorting the 78 cards proved to be 
a long and laborious task, however, and it was 
not easy to maintain all Ss’ interest in the test 
or to ensure a proper set. Furthermore, a 
number of items were found to be worded in 
a complex and possibly confusing fashion. 
(Example: “I do not often notice my ears 
ringing or buzzing.”) Answering such items 
under standard instructions seemed difficult 
for a number of Ss; with the added operation 
required under role playing instructions, the 
difficulty seemed to be compounded. Whether 
the present results reflect these difficulties, 
which might have led to random sorting on 
the role playing task, or whether the results 
suggest a true inability of schizophrenics to 
predict how normals would respond to these
test items, remains an unanswered question
here. 

Because results with the MMPI in the pres- 
ent study are at variance with the results re- 
ported by Grayson and Olinger (1957), a 
sample of 15 schizophrenic women, none of 
whom had been included in the study re- 
ported here, were administered the entire 
MMPI under both standard and role taking 
instructions. An analysis of this small sample 
confirms the results found in the present 
study. No significant changes were found in 
the apparent ability of this small sample 
to reduce signs of psychopathology on the 
MMPI under role taking instructions. In a 
sample of 14 schizophrenic men tested with 
the MMPI under standard and role taking 
instructions, there was again no significant 
reduction in signs of psychopathology from 
the first to the second test administration on 
any MMPI variable. In comparing the sam- 
ple of cases used in the present study with 
the sample used by Grayson and Olinger, it 
seems possible that their sample consisted of 
cases who did not have as extensive psycho- 
pathology as the sample used in the present 
investigation. Furthermore, it is possible that 
there is a difference in general language fa- 
cility. These two differences may account for 
the different findings. The ability to take the 
normal role appears to be a variable trait 
which is partially related to severity and 
longevity of psychopathology and to subse- 
quent clinical history. Level of intelligence, 
however, especially language facility, may 
also be related to this trait. 


SUMMARY 


In the present study an attempt was made 
to throw further light on role playing in 
schizophrenia. On the basis of previous re- 
search, it was hypothesized that acutely ill 
schizophrenics would be better able to play 
the normal role than chronically ill ones, and 
that whether acutely or chronically ill, schizo- 
phrenics who subsequently improved would 
be better able to play the normal role than 
those who did not. Although the results were 
all in the predicted direction, they did not 
generally achieve high statistical significance. 
THERAPIST-PATIENT RELATIONSHIPS AND OUTCOME
OF PSYCHOTHERAPY 


MORRIS B. PARLOFF 1


National Institute of Mental Health


An assumption underlying most forms of 
psychotherapy is that the relationship be-
tween the therapist and his patient is the ve-
hicle for therapeutic change. More specifically,
the benefits from therapy are believed to vary 
directly with the quality of the therapist-pa- 
tient relationship (Betz & Whitehorn, 1956; 
Freud, 1949; Rogers, 1951; Snyder, 1959). 
It is frequently assumed that the fact of a 
patient’s remaining in treatment may be in-
terpreted as evidence of the “goodness” of
the relationship and therefore of the prob- 
ability of an ultimately successful outcome. 
These widely held beliefs have not, how-
ever, gone unchallenged. Eaton (1959) re-
cently warned that a “good relationship”
may indeed interfere with therapeutic out-
come. He stated that a “good relationship
may help influence the client to become de-
pendent on such help and to continue seek-
ing it,” thereby defeating the therapeutic
goal of helping the client to achieve au-
tonomy. Redl and Wineman (1951) also
pointed out the potential limitations of a
seemingly good relationship. They stressed
that the therapist who establishes a close,
warm, and permissive relationship with a pa-
tient may find himself occupying the non-
therapeutic role of “friend without influence.”
These contradictory viewpoints may be due 
to the fact that the investigators used differ- 
ing definitions of the concept good relation- 
ship. What is required is a more explicit defi- 
nition of the therapist-patient relationship 
concept and the systematic testing of it. To 


1 This study was conducted at the Henry Phipps 
Psychiatric Clinic of the Johns Hopkins Hospital, 
Baltimore, Maryland. The author expresses grateful 
acknowledgment to J. D. Frank, E. Ascher, H. Kel- 
man, D. Rosenthal, E. Nash, and A. R. Stone for 
their cooperation and many helpful suggestions. 
date there have been very few experimental
tests made regarding the association between
favorable outcome and quality of the thera-
pist-patient relationship (Heine, 1950; Holt 
& Luborsky, 1952; Snyder, 1953). 

There are many theoretical frames of ref- 
erence from which the concept of relation- 
ship may be viewed, yet, according to Fiedler 
(1950), these differences may readily be sub- 
sumed under one general description of the 
“ideal therapeutic relationship.” He reported 
that therapists of diverse theoretical persua- 
sions revealed a remarkable degree of agree- 
ment in characterizing the ideal therapeutic 
relationship. The very concept of the ideal 
therapeutic relationship appears however to 
violate the clinician’s belief that to be effec- 
tive a relationship must be adapted and 
modified to meet the particular needs of a 
given patient. An examination of the Fiedler 
instrument is reassuring since the descriptive 
statements are written at such a high level of 
abstraction as to encompass the relationship 
needs of most patients as well as most non- 
patients. This study, therefore, employs Fied- 
ler’s view of the therapeutic relationship to 
provide an operational definition of this elu- 
sive concept. 

This report describes an effort to test, in a 
group therapy setting, the correlations be- 
tween patient-change, remaining in treatment, 
and quality of therapeutic relationship. Since 
the definition of the construct “therapeutic 
relationship” has not been widely agreed 
upon, and no criteria have been accepted as 
valid, this study is to be viewed as a test of 
the concept’s construct validity (Cronbach & 
Meehl, 1955). It is further recognized that 
the construct validity of the outcome criteria 
employed here also may be regarded as under 
study. 
The concept of the “therapeutic process”
presupposes that psychotherapy proceeds in 
a discernibly systematic step-wise fashion. 
Therefore, some investigators believe that 
certain kinds of intermediate change may be 
viewed as harbingers of significant benefit to 
the patient. The investigator who accepts this 
idea ascribes to a variety of phenomena the 
status of “enabling” or intermediate condi- 
tions necessary for beneficial change. Unfor- 
tunately, the relationship between the alleged 
roadmarkers and the destination is not as yet 
established. In group therapy, for example, it 
may be encouraging to the therapist to note 
increased group cohesiveness, evidences of 
group support and stimulation, establishment 
of multiple transferences, recall of repressed 
material, resolution of transferences, etc. That 
such phenomena do not invariably eventuate 
in clinical improvement will be conceded even 
by the most ardent group therapist. The au- 
thor decided, therefore, to concentrate on cri- 
teria that related to ultimate goals, such as 
providing symptomatic relief and improving
social functioning, rather than intermediate 
goals. The assumption that “improvement” is 
a unitary phenomenon is questionable (Kel- 
man & Parloff, 1957). This is especially the 
case where improvement is “less than com- 
plete recovery.” This broad category unfor- 
tunately includes a considerable proportion 
of all patients treated. If, then, improvement 
cannot be discussed in global terms, it is 
necessary to specify the various criteria and 
measures. 

The three criteria of improvement adopted 
in this study are based on the work of Kogan 
and Hunt (1950) and Miller (1951). These 
criteria are: Comfort, Effectiveness, and Ob- 
jectivity. The first two criteria, Comfort and 
Effectiveness, are based on the belief that the 
general aim of psychotherapy is to ameliorate 
the patient’s suffering and to restore him to 
an effective level of functioning in the com- 
munity. Comfort was defined in terms of 
symptoms or feelings which had caused dis- 
tress. Effectiveness was defined as the degree 
of competence with which the patient man- 
aged to fulfill his own needs and desires as 
well as those of others. With the third cri- 
terion, Objectivity, an attempt was made to 
take cognizance of a value which is shared by 
a number of quite different psychotherapeutic
approaches. All assert that the better the in- 
dividual understands himself the freer he will 
be to react appropriately to conditions arising 
in his current life. Objectivity is not an end- 
point per se but a generally accepted means 
to an end. 

That the patient remain in treatment is a 
necessary but not sufficient condition for psy- 
chotherapy to be effective. The amount of 
time necessary for change to occur varies 
from patient to patient. The patient may re- 
main in treatment and yet fail to improve. 
Although a patient who drops out of therapy 
may have derived considerable benefit, his 
departure may preclude any objective assess- 
ment of this benefit. The factors that deter- 
mine whether a patient will remain in treat- 
ment may or may not coincide with those 
which determine whether he will improve if 
he does remain. 

In the present study it was hypothesized 
that changes evidenced in Comfort, Effective- 
ness, and Objectivity are related to the qual- 
ity of the therapeutic relationship. It was 
also hypothesized that remaining in treatment 
is similarly a function of the goodness of the 
relationship. 


METHOD 


The eight instruments used in this study to meas- 
ure the criteria (Comfort, Effectiveness, and Objec- 
tivity) will be described only briefly. A fuller de- 
scription may be found elsewhere (Kelman & Parloff, 
1957).2 Since the therapy goals to be reported were
the amelioration of discomfort and the modification 
of ineffectual behavior, the scales measuring Comfort 
and Effectiveness were reversed to measure instead 
the degree of “Discomfort” and “Ineffectiveness.”

Judgments regarding the patient’s Discomfort, In- 
effectiveness, and Objectivity were made by research 
teams composed of a psychiatrist, social worker, and 
psychologist. Specially designed “evaluation” scales
were filled out independently by each member of 
the research team in describing each patient. These 
judges then met to discuss their ratings and to arrive 
at an overall single staff rating for the scale measur- 


2 Copies of measures used in the evaluation of psy-
chotherapy have been deposited with the American
Documentation Institute. Order Document No. 6464
from ADI Auxiliary Publications Project, Photo-
duplication Service, Library of Congress, Washing-
ton 25, D. C., remitting in advance $1.75 for micro-
film or $2.50 for photocopies. Make checks payable
to: Chief, Photoduplication Service, Library of Con- 
gress. 
ing each criterion. In addition to these staff ratings,
two measures of Discomfort were obtained from the 
patients, one measure of Ineffectiveness was derived 
from ratings made by group members of each other, 
and two measures of Objectivity were obtained by 
(a) comparing the patient’s self-description with an 
independent staff observer's description of him, and 
(b) determining the accuracy of the patient’s predic- 
tions of the ratings he would receive from each of 
his fellow patients. The staff members who partici- 
pated in completing the staff evaluation scales did 
not act as judges of the therapeutic relationship for
the same patients except in two cases.


Measures of Discomfort 


1. Self-Satisfaction Q sort (Patient): This at-
tempted to tap the degree of congruence between
the patient’s perceived self and ideal self regarding
behavior in the group therapy situation. A 60-item
“perception” Q sample was employed. It is based on
Bion’s (1950) group interaction concepts.

2. Symptom Disability Checklist (Patient): This is
a modification of the Cornell Index. Forty-one items
referring to psychic or somatic complaints were rated
by the patient in terms of the relative distress they
caused him during the week preceding testing.

3. Discomfort Evaluation Scale (Staff): This con-
sists of items describing 10 areas of interpersonal dis-
comfort. The scale was filled out independently on
each patient by three staff raters: psychiatrist, social
worker, and psychologist. These judges then met to
discuss the ratings and to agree on an overall single
staff rating on the basis of their combined clinical
judgment.


Measures of Ineffectiveness


1. Ineffectiveness Evaluation Scale (Fellow Pa-
tients): Each patient rated each of the other pa-
tients in his group on three dimensions: the
extent to which the rater respected his
ideas and opinions, regarded him as a leader,
and desired to be friends with him. Ratings on each
dimension were made on a four-point scale and were
reported individually as measures of Respect, Leader-
ship, and Friendship. The average Overall rating a
patient received on the three measures was also com-
puted.

2. Ineffectiveness Evaluation Scale (Staff): This
scale consists of 15 items in which [illegible]-
tivity, productivity, and [illegible] of social roles
are rated. The ratings concern the degree of appro-
priateness of the behavior and the frequency with
which it occurred in relation to significant persons
in the patient’s home and community. The above
mentioned staff members independently completed
this form. On the basis of a conference a unified staff
rating was made.


Measures of Objectivity


1. Objectivity Q sort (Patient-Observer): Objec-
tivity was measured by the degree of congruence be-
tween the patient’s description of his group behavior
and the staff observer’s description of the patient’s 
behavior in the group. The Q sort items were the 
same as those used for measuring self-satisfaction. 

2. Objectivity Evaluation Scale (Patient-Fellow Pa- 
tient): In completing the group questionnaire previ- 
ously described, patients were asked to predict the 
ratings which they would receive from each of their 
fellow group members. The average discrepancy be- 
tween the ratings each patient expected from each 
fellow patient and the ratings he actually received 
was computed for each of three areas: Respect, 
Friendship, and Leadership. 

3. Objectivity Evaluation Scale (Staff): This scale
consists of four items attempting to measure the ac- 
curacy of the patient’s perceptions of his own be- 
havior and the behavior of others. Staff ratings were 
made independently and then combined into a single 
staff rating by the conference method. 

Except for the Symptom Disability Checklist, 
which was completed prior to therapy, all initial 
testings were made immediately after the fourth 
group session. All measures were repeated following 
the twentieth group meeting. 
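
The patient-fellow patient Objectivity measure described above amounts to an average discrepancy between expected and received ratings. A minimal sketch follows; the function name, the patient labels, the ratings, and the choice of absolute (unsigned) discrepancies are all assumptions, since the text does not say whether the direction of the error was retained.

```python
from statistics import mean

def objectivity_discrepancy(expected, received):
    """Mean absolute gap between the ratings a patient expected from each
    fellow patient and the ratings those patients actually gave him.

    Both arguments map a fellow patient's label to a rating on the
    four-point scale. Smaller values indicate more accurate prediction.
    """
    if expected.keys() != received.keys():
        raise ValueError("expected and received must cover the same raters")
    return mean(abs(expected[rater] - received[rater]) for rater in expected)

# Hypothetical Respect ratings for one patient in a four-person group.
expected_respect = {"P2": 3, "P3": 4, "P4": 2}
received_respect = {"P2": 2, "P3": 4, "P4": 4}

respect_discrepancy = objectivity_discrepancy(expected_respect, received_respect)
```

The same function would be applied separately to the Friendship and Leadership ratings, giving the three area scores mentioned in the text.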


Drop-Outs 


Any patient who left the group prior to the twen- 
tieth session without his therapist’s approval was 
considered to have terminated prematurely. Four of 
the 21 patients were so designated. By the end of 
the experimental period each patient had attended 
an average of 9.6 sessions. The attendance ranged 
from 5 to 12 sessions.3


Therapeutic Relationship 


The technique developed by Fiedler (1950) was 
employed. He had the 75 statements in his Relation- 
ship Q Sample sorted by members of various schools 
of therapy. On this basis, an ideal therapeutic rela- 
tionship standard was developed. Twenty-five items 
concerned the therapist’s ability to communicate with 
and to understand the patient, 25 described the “emo- 
tional distance” between the therapist and patient, 
and the remaining 25 dealt with questions of “status” 
as reflected in the therapist’s behavior toward the 


3 Prior to assignment to one of three therapy
groups, each of the 21 patients had been “screened” 
by exposure to a 6-week orientation group. Experi- 
ence with group therapy had indicated that more 
than one-third of the patients dropped out of therapy 
by the end of the fifth session. This involved a loss 
of the time invested in initial evaluations of such pa- 
tients. The aim of the orientation group was to ex- 
pose patients to a group experience similar to that 
which they might experience in the actual therapy 
situation. It was hoped that patients who survived 
six sessions might then tend to remain in group ther- 
apy. The selection process did, in fact, act to in- 
crease the proportion of patients remaining beyond 
the fifth hour in therapy. Only one dropped out of 
treatment by the fifth hour. 
patient. In the present study, the 75 items were used
by the observers to describe the relationship between
therapist and patient.4 These arrays were then cor-
related against the Fiedler ideal therapeutic relation- 
ship standard. The higher the correlation with the 
standard, the “better” the relationship. Three trained 
observers were used as judges to describe the thera- 
peutic relationship established by two therapists. 
After a preliminary practice period, the interjudge 
reliability in describing the same therapist-patient 
interaction was found to be substantial. The correla- 
tion between the relationships described by pairs of 
judges in observing 19 patient-therapist interactions 
was .92.5 One of the three judges was assigned to 
each of the three groups and attended all group 
meetings during the experimental period of 20 weeks. 
Each judge described the relationships established by 
the therapist with each patient in the group after 
the second meeting, the twelfth meeting, and the 
twentieth meeting. Each of the three descriptions
was correlated against the standard and the overall
relationship was described as the mean of the three 
correlations.6 
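
The relationship score described above can be sketched in two steps: correlate an observer's Q-sort array with the ideal standard, then average the three correlations through Fisher's z transformation, as footnote 6 indicates. The code is illustrative only. The five-item arrays and the three correlations are hypothetical, and the real arrays held 75 items.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score arrays."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("arrays must have equal length of at least 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_correlation(rs):
    """Average correlations via Fisher's z: z = atanh(r), then transform back."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))

# Hypothetical pile placements for a 5-item sort.
ideal_standard = [1, 2, 3, 4, 5]
observer_sort = [2, 1, 3, 5, 4]
r_one_meeting = pearson_r(ideal_standard, observer_sort)

# Hypothetical correlations after the second, twelfth, and twentieth meetings.
overall_relationship = mean_correlation([0.50, 0.60, 0.70])
```

Averaging in z units rather than averaging the raw correlations gives slightly more weight to the higher coefficients, so the result here falls a little above .60.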

The sample consisted of 21 psychoneurotic pa-
tients, 10 male and 11 female. Fourteen of the pa-
tients were classified as “psychoneurotic disorders,” 
five as “personality disorders,” one as “psychotic dis- 
order,” and one as “transient situational personality 
disorder.” They were randomly assigned to three 
groups, ranging in size from six to eight. A treated 
13 patients, 6 in one group and 7 in the other, B 


4 Fiedler’s concept of the ideal therapeutic relation-
ship was derived primarily from experiences in indi-
vidual therapy. It was necessary, therefore, to deter-
mine whether the group therapists in the present
study conceived of the ideal therapeutic relationship
in a similar fashion. Each therapist was asked to de-
scribe, by means of the 75-item Q sort, his concep-
tion of the ideal therapeutic relationship. These ar-
rays were correlated with the Fiedler standard. It
was found that A’s and B’s ideals correlated .86 and
.88, respectively, with the Fiedler “ideal.” It was
concluded, therefore, that the aims of these group 
therapists were sufficiently consonant with the Fied- 
ler standard to permit its use as the criterion for 
measuring the goodness of relationships. 

5 Since the data collection involved the use of in- 
dependent group observers, the question of the inter- 
rater and intrarater reliability is an important one. 
Although the reliability measures described here ap- 
pear to be adequate, data are available only for the 
initial period of the study. Since no further attempt 
to check on the reliability of the judges was made 
during the period of the study, there is no direct 
evidence that judges continued to describe the thera- 
peutic relationship in a consistent manner through- 
out the experiment. Indirect evidence on this point is 
found in the fact that the patients’ descriptions of 
their relationships with their therapists correlated 
substantially with those ascribed to them by the 
judges (rho = .79). 

6 The necessary z transformations were made.
treated one group of 8 patients. The data reported
are based on the first 20 sessions of each group and 
are, therefore, limited to the early period of treat- 
ment. Of the initial 21 patients, 14 completed all ex- 
perimental procedures by the close of the 20-week 
period.7

Each group met for an hour and a half once a 
week. The form of therapy was largely interpretive 
with the focus on the immediate interaction of the 
patients with each other and with the therapist. 

The two therapists qualified as “experts” as de- 
scribed by Fiedler, i.e., each had completed pre- 
scribed training, had been a practicing therapist for 
a minimum of 5 years, and was considered an expert 
by other therapists within his school. 


RESULTS 


To determine whether improvement varies 
with the quality of the therapeutic relation- 
ship, product-moment correlations were com- 
puted between the Fiedler ideal therapeutic 
relationship scores and each of the 14 change 
scores. Inspection of the therapeutic relation- 
ship scores revealed that the four patients 
treated by B had each achieved relationships 
which were higher than any established by 


7 In addition to the four drop-outs already men-
tioned, one patient left treatment when her husband's 
job was transferred out of the city, and one failed 
to complete all evaluation procedures. Another pa- 
tient was excluded from the study when it was 
learned that he had supplemented group therapy 
with intensive individual therapy. The effectiveness 
of the group therapy relationship was, therefore, 
confounded with the individual therapy relationship. 

8 To determine whether the evaluation measures
were initially related to the quality of therapeutic
relationships subsequently established, correlations
were computed between initial scores on each of the
14 measures and the overall mean therapeutic rela-
tionship scores. The correlations obtained did not
differ significantly from zero. The range was from
.126 to -.452. (In order for a correlation with 12
degrees of freedom to be significant at the .05 level
of confidence, a correlation of .532 is required.)

To further test whether the therapeutic relation- 
ship was associated with initial scores on these 
evaluation measures, the initial scores of the seven 
patients who had therapeutic relationships above the 
group median were compared with the seven A 
tients whose therapeutic relationship scores fell a 
low the median. None of the group differences as 
tested by the £ test attained statistical significance. 
In view of the apparent lack of association Saget 
the 14 initial evaluation scores and the quality o; 
the subsequent therapeutic relationships established, 
we were justified in computing correlations between 
therapeutic relationships and the difference scores be- 
tween the initial and final evaluation scores. 
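
The critical value quoted in this footnote can be checked by converting a critical t into the equivalent correlation, r = t / sqrt(t² + df). The sketch below is illustrative only: the tabled two-tailed .05 value of t for 12 degrees of freedom (2.179) is an assumption taken from standard tables, not from the text, and the function name is hypothetical.

```python
import math

def critical_r(t_crit: float, df: int) -> float:
    """Convert a critical t value into the equivalent critical correlation."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

# Tabled two-tailed .05 critical t for 12 df (assumed value, not from the text).
T_CRIT_05_DF12 = 2.179

r_needed = critical_r(T_CRIT_05_DF12, df=12)
print(f"critical r for df = 12: {r_needed:.3f}")
```

Under that assumption the result agrees with the .532 stated in the footnote.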


the 10 patients treated by A. In effect, the
therapeutic relationships established by each 
therapist with his patients appeared to come 
from different “populations” of relationships. 
This suggested that the influence of the thera- 
peutic relationship variable, which is at issue 
in this investigation, may be confounded with
other variables related to the individual thera- 
pist. 

The correlations between therapeutic rela- 
tionship and the measures of change were 
therefore computed independently for A’s pa- 
tients and for B’s patients. To obtain the best 
estimate of the true mean correlations for the 
total sample a pooled correlation (r̄) was computed.⁹
As may be seen in Table 1, 3 of the
14 mean correlations differed significantly

9 To determine whether the patients of each therapist
differed on their initial evaluation measure
scores, a Mann-Whitney U test was computed for
each of the 14 measures. No significant differences
were found.


TABLE 1

POOLED MEAN CORRELATIONS BETWEEN CHANGE AND
MEAN THERAPEUTIC RELATIONSHIP
(Computed for 10 Patients Treated by A and 4 by B)

                                                       Pooled r̄
Measure                                                (N = 14)

Discomfort
  A. Self-Satisfaction Q sort (Patient)                  .037
  B. Symptom Disability Checklist (Patient)              .67**
  C. Discomfort Evaluation Scale (Staff)                 [illegible]

Ineffectiveness
  A. Ineffectiveness Evaluation Scale (Fellow Patients)
       1. Respect                                        [illegible]
       2. Leader                                         .61*
       3. Friend                                         [illegible]
       4. Overall Total                                  [illegible]
  B. Ineffectiveness Evaluation Scale (Staff)            [illegible]

Objectivity
  A. Objectivity Q sort (Patient-Observer)               .521
  B. Objectivity Evaluation Scale (Patient-Fellow Patients)
       1. Respect                                        [illegible]
       2. Leader                                         [illegible]
       3. Friend                                         [illegible]
       4. Overall Total                                  [illegible]
  C. Objectivity Evaluation Scale (Staff)                .67**

Note. The direction of "negative" [text illegible] so that the greater the
[text illegible] between relationship and improvement.
* r significant at the .05 level, one-tailed test.
** r significant at the .01 level, one-tailed test.

TABLE 2

PATIENT-THERAPIST RELATIONSHIPS (z) ESTABLISHED
WITH A AND B

                    A                   B
          Group I     Group II      Group III

            .51a        .74           1.28
            .23         .71           1.10
            .15         .63           1.06
            .10         .62           1.00b
            .09         .53            .99
            .04         .24c           .81c
                        .17d           .73c
                                       .63c

Mean        .19         .52            .95
SD          .17         .225           .214
N           6           7              8

a Moved out of city.
b Failed to complete evaluation procedures.
c Terminated prematurely.
d Supplemented group therapy with simultaneous individual therapy.


from zero.¹⁰ The findings indicate that the
more closely the therapeutic relationship approximated
the ideal relationship the greater
the increase in the patient's Objectivity (as
evaluated by the staff, r̄ = .67, p < .01); the
greater the increase in group Effectiveness-Leadership
(as derived from ratings by fellow
patients, r̄ = .61, p < .05); and the
greater the relief from symptomatic Discomfort
(as reported by the patient, r̄ = .67, p <
.01). It is noted that the correlation between
therapeutic relationship scores and Objectivity
as measured by the Q sort falls just short
of reaching an acceptable level of significance
(r̄ = .52, while an r of .53 is required for p
< .05).

These findings indicate that the quality of 
the therapeutic relationship does vary on cer- 
tain measures with patient-change in the 
areas of Objectivity, Effectiveness, and Com- 
fort when the therapist variable is controlled. 
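
The pooling of the two within-therapist correlations can be illustrated with a short sketch. The paper states only that z transformations were made; weighting each z by n - 3 is the conventional choice and is an assumption here, as are the two input correlations, which are hypothetical and not reported in the paper.

```python
import math

def pooled_correlation(rs, ns):
    """Pool correlations via Fisher's z, weighting each z by n - 3."""
    weights = [n - 3 for n in ns]
    zs = [math.atanh(r) for r in rs]
    mean_z = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return math.tanh(mean_z)

# Hypothetical within-therapist correlations (not from the paper):
# therapist A with 10 patients, therapist B with 4 patients.
r_pooled = pooled_correlation(rs=[0.70, 0.40], ns=[10, 4])
print(f"pooled r = {r_pooled:.2f}")
```

With only four patients, B's correlation carries a weight of 1 against a weight of 7 for A, so under this weighting the pooled value is dominated by A's patients.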

That an association exists between the 
quality of the therapeutic relationship and 
the incidence of drop-outs is strongly sug- 
gested (see Table 2). When the eight pa- 
tients initially assigned to B’s group were 
ranked according to the quality of the thera- 
peutic relationship established, it was found 


10 Since the direction of the correlation was pre- 
dicted, a one-tailed test was applied. 
TABLE 3

MEAN DIFFERENCE IN OVERALL
GROUP RELATIONSHIPS

                      Mean
Comparison            Difference    df     t       p

Group I vs. II          .33         11    2.94    .02
Group I vs. III         .76         12    7.17    .001
Group II vs. III        .43         13    3.80    .01


that those patients who had remained in 
treatment occupied ranks 1 through 5. Those 
who terminated prematurely were in rank or- 
der positions 6 through 8. In investigating 
the rank order position of the single indi- 
vidual who dropped out of one of A’s groups, 
it was found that of a group of seven the 
terminator had occupied position Number 6. 
Moreover, the patient in rank order position 
Number 7 was found to have supplemented 
his group therapy experience with intensive 
individual psychotherapy without having noti- 
fied the group therapist. Thus, the five who 
dropped out or found it necessary to supple- 
ment group treatment appear to have had 
the poorer relationships when compared to 
the other members of the particular group to 
which they were assigned. An inspection of 
the group in which no member dropped out 
showed that the relationships were quite uniform
and, incidentally, very low. The variance
within this group tended to be smaller than 
in the other groups. The mean relationship in 
this group (Group I) was the lowest of the 
three groups (see Table 3). Thus the group 
having the poorest relationships remained in- 
tact while the group having the highest rela- 
tionships lost 38% of its members. From 
Table 2 it appears that the quality of the 
therapeutic relationship is related to prema- 
ture termination; however, the absolute size 
of the relationship score appears to be less 
important than the terminator’s relationship 
score relative to those of other members of 
his group. 
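
The group comparisons in Table 3 can be approximately reproduced from the summary statistics in Table 2. The sketch below assumes a pooled-variance t test and treats the printed SDs as n - 1 sample standard deviations; neither assumption is stated in the paper. The Table 2 values are read as Mean .19, .52, .95; SD .17, .225, .214; N 6, 7, 8.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / se, df

# (mean, SD, N) for each group, as read from Table 2.
groups = {"I": (0.19, 0.17, 6), "II": (0.52, 0.225, 7), "III": (0.95, 0.214, 8)}

results = {}
for a, b in [("I", "II"), ("I", "III"), ("II", "III")]:
    t, df = pooled_t(*groups[a], *groups[b])
    results[(a, b)] = (t, df)
    print(f"Group {a} vs. {b}: t = {t:.2f}, df = {df}")
```

The resulting t values fall close to the printed 2.94, 7.17, and 3.80; small discrepancies are to be expected because the inputs are rounded.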

In designing the study it was decided to use 
observers to describe the relationships since it 
was assumed that patients’ perceptions of 
their relationships with the therapist would 


be subject to the distorting influence of trans- 
ference. However, the fact that the patients’ 
premature termination of therapy appears to 
be related to the quality of relationships, as 
evaluated by observers, implies that the pa- 
tients experienced their relationships much as 
described by these judges. To investigate this 
apparent concordance, an attempt was made 
to determine the degree of agreement be- 
tween the observer’s and the patient’s de- 
scriptions of the relationships. At the end of 
the experimental period the 14 remaining pa- 
tients were asked to describe their relation- 
ship with the therapist by using the Fiedler 
Q sort. Various items were modified to clarify 
technical terms. Patients were assured that 
the data they furnished was confidential and 
would not be relayed to their therapists. Their 
descriptions were then correlated with the 
ideal to derive “relationship scores.” Each 
patient’s relationship score was then corre- 
lated with his mean relationship score as pro- 
vided by the observer. The rho correlation 
was found to be .79 (p < .01). This finding 
strongly suggests that in these groups pa- 
tients were quite objective in perceiving their 
relationships with their therapists. Transference
distortions as measured here appear to
have played a surprisingly small role. 

Although the focus of the paper has been 
on the effect of the patient-therapist relation- 
ship on the outcome of group therapy, it is 
recognized that the patient-patient relation- 
ships also play a significant role. In the sam- 
ple of patients studied it was found that those 
who established better relationships with their 
therapist reported a significantly greater in- 
clination to perceive other group members 
as being socially attractive (Ineffectiveness 
Evaluation Scale-Friendship) than those who 
formed poorer relationships with their thera- 
pist. 


Discussion 


The findings of this study appear to pro- 
vide limited support for the hypothesis that 
patient-change in psychotherapy is related to 
the quality of the therapeutic relationship es- 
tablished. Of a total of 14 change measures, 
3 revealed significant correlations with thera- 
peutic relationship scores. It is of interest 
that one measure of each criterion of improvement
(Discomfort, Ineffectiveness, and Objectivity)
showed this association. The
data indicate that the better the patient- 
psychotherapist relationship, the greater the 
symptomatic relief experienced by the pa- 
tient, the more likely it was that fellow group 
members would describe the patient as having 
become more dominant (leader), and the 
greater the increase in Objectivity attributed 
to the patient by the research staff. Since the 
treatment period included in this study was 
arbitrarily limited to the initial 20 weeks, it 
is not clear whether the associations de- 
scribed here characterize only the early stages 
of psychotherapy. It is not known whether 
these correlations are maintained, diminished, 
or increased, or whether additional correla- 
tions will be found between other outcome 
measures and the therapeutic relationship 
scores in later periods of psychotherapy. It 
is possible that not all the behaviors meas- 
ured in this study are modifiable at the same 
rate. The probability that a significant cor- 
relation would be found with the therapeutic 
relationship scores was not necessarily equal 
for each of the 14 measures. However, the re- 
searcher did assume that each measure had 
face validity and that therefore the hypothesis
could reasonably be tested against each of
these 14 measures.¹¹


11 Experience with the Objectivity Evaluation
Scale (Patient-Fellow Patients) indicates that it required
the raters of the four measures contained in
this scale to make rather complex judgments. The
ostensible task for the patient was to predict the
rating he would receive from each patient [on each]
of three measures: Respect, Leadership, and Friendship.
In order to evidence Objectivity he had to be
able to predict not only how he was [perceived by]
each of his fellow patients but also the [rating]
each would be willing to assign to him. [Raters are]
frequently keenly aware that the scores [one]
assigns to others will also provide the [examiner]
with information about the rater. [Thus,]
ratings are frequently intended primarily as [a] communication
to the examiner rather than [as an] objective
report of the raters' experiences and feelings
about [others]. Some patients wish to be seen
as warm and friendly and therefore assign high
scores fairly indiscriminately to their [fellow] patients.
Others wish to be seen as aloof and independent
of the group members and [therefore assign low]
scores. In [addition,] some patients appear reluctant to
reveal warm feelings concerning group [members] of
the opposite sex and therefore tend to minimize these
ratings.


Statistically significant changes for the 
group as a whole, over the 20-week period, 
occurred only on three measures: Symptom 
Disability Checklist, Discomfort Evaluation 
Scale (Staff), and Ineffectiveness Evaluation 
Scale (Staff). Only one of these change meas- 
ures was found to be significantly correlated 
with therapeutic relationship scores. Although 
it is possible that individual patients experi- 
enced real changes even if the overall sam- 
ple did not, it is equally possible that the 
change scores used in this study may repre- 
sent measurement error, particularly in those 
instances where a lack of correlation with 
therapeutic relationship scores was shown. In 
those instances where no significant change 
occurred, the possibility of measurement error 
must be considered a possible explanation for 
the lack of correlation between change scores 
and therapeutic relationship scores. 

The question arises as to the explanation 
of the finding that one measure of each cri- 
terion showed a significant association with 
the quality of the therapeutic relationship, but 
other measures of the same criterion failed to 
show similar associations. One condition un- 
der which such findings could have occurred 
is that each of the three hypothesis-support- 
ing measures was independent of the other 
measures of a given criterion. To the degree 
that the various measures are independent of 
one another they may be expected to vary in- 
dependently with the quality of the thera- 
peutic relationship. A second condition which 
might account for these findings is that the 
measures were related to each other in a com- 
plex fashion—i.e., two measures may be cor- 
related under some circumstances, and not 
correlated, or even negatively correlated, un- 
der other circumstances. To determine the de- 
gree and nature of the association between 
the three criterion-supporting measures and 
the other measures of the relevant criterion, 
the initial scores on these measures were in- 
tercorrelated. Thus scores on the Symptom 
Disability Checklist were correlated with the 
other two Discomfort measures, Objectivity 
Evaluation Scale (Staff) scores were corre- 
lated with the other five Objectivity meas- 
ures, and Leader scores on the Ineffectiveness 
Evaluation Scale were correlated with the 
other four Ineffectiveness measures. The condition
of independence of measures was found
to be the case with the Objectivity group of 
measures. No significant correlations were 
found between the initial scores of the Ob- 
jectivity Evaluation Scale (Staff) and the re- 
maining Objectivity measures. Since the meas- 
ures apparently tap different aspects of the 
criterion, the fact that change on one meas- 
ure correlates with relationship scores does 
not lead to the expectation that change on 
other measures will also be associated with 
relationship scores. 

The possibility that some contamination of 
measures had occurred on the Objectivity 
Evaluation Scale (Staff) must be considered 
since in two instances one member of the 
three-man team of staff raters had also been 
the observer who described the therapeutic 
relationships. This fact need not cast serious 
doubt on the authenticity of the correlation 
between changes on Objectivity Evaluation 
Scale (Staff) and quality of therapeutic re- 
lationship for the following three reasons: 
(a) Staff ratings were determined by confer- 
ences of three staff members. There is no 
evidence that the opinion of any one team 
member was weighted disproportionately. (b) 
Even under the most unfavorable circum- 
stances only 2 of the 14 cases could have 
been affected by the possible contamination 
of measures. (c) If the staff ratings and the 
judgments regarding therapeutic relationships 
had been spuriously correlated due to con- 
tamination of measures, the same association 
would also be anticipated with the other two 
staff ratings on Ineffectiveness and Discom- 
fort. No such evidence is found. 

The condition of “complexity of associa- 
tion” between measures appears to apply to 
the cases of Ineffectiveness and Objectivity 
measures. 

In the case of Leadership, Ineffectiveness 
Evaluation Scale (Fellow Patients), it was 
found that it failed to correlate significantly 
with any other measure of Ineffectiveness with 
the exception of the Overall Total score. This 
is to be expected since the Overall Total 
measure is simply the sum of Leadership, 
Respect, and Friendship scores. Leadership 
scores, however, were not found to correlate 
with either Respect or Friendship. Appar- 
ently a patient’s dominant group behavior 


did not win him the respect or friendship of 
his peers. Since the Overall Total Evaluation 
Scale score is made up of components which 
are independent of each other, it is not sur- 
prising that changes on this measure do not 
correlate with therapeutic relationship scores, 
despite the fact that Leadership change scores 
do show a significant association with thera- 
peutic relationship scores. 

The Symptom Disability Checklist, a meas- 
ure of Discomfort, was found to correlate sig- 
nificantly (.55) with Self-Satisfaction Q sort. 
The expectation that changes in Self-Satisfac- 
tion Q sort scores might, like the Symptom 
Disability Checklist change scores, correlate 
with therapeutic relationship scores was not 
supported. An analysis of the Self-Satisfac- 
tion Q sort data revealed that two quite op- 
posite changes had occurred. Those patients 
who initially reported very high self-satisfac- 
tion appeared to become less content with 
themselves, while those who initially showed 
the greatest discontent tended to show greater 
self-satisfaction as therapy progressed. This is 
not a simple regression toward the mean for 
it was found that the patients who initially 
indicated that their group behavior very 
closely approximated their ideal behavior 
were the ones who showed rather poor Ob- 
jectivity. As these individuals became increas- 
ingly aware of their actual behavior, this was 
reflected in their description of their group 
behavior as being less in accord with their 
rather stable ideals. As a result of this shift 
the initial correlation between self-satisfac- 
tion and symptomatic comfort was dissipated. 

The fact that a positive relationship was 
found between the quality of therapeutic re- 
lationship and three measures does not, of 
course, permit one to assign direction to this 
association. In the case of Leadership Ineffec- 
tiveness Evaluation Scale, for example, two 
equally plausible interpretations can be made: 
patients who established the better relation- 
ships with their therapists were able to be- 
come more assertive and dominant in their 
therapy groups; or, the therapist tended to 
relate to those patients who manifested in- 
creasing evidences of leadership and domi- 
nance in the group. Despite the fact that the 
findings are consistent with the general hy- 
pothesis that improvement follows upon the 
establishment of a good relationship, other
alternative interpretations must be consid- 
ered. The possibility that the relationship es- 
tablished may be associated with character- 
istics of the patient is not ruled out by the 
absence of a significant correlation between 
initial scores on the 14 measures used in this 
study and the quality of the subsequent thera- 
peutic relationship. On the contrary, there is 
compelling evidence that the therapist's per- 
ception of his patients is intimately associ- 
ated with the quality of the relationship he is 
able to establish with them (Parloff, 1956). 
Such correlation was evidenced not only in 
the initial 4 weeks of therapy, but through- 
out the 20-week experimental period (Parloff, 
1953).

It may be possible, for example, that one 
of the characteristics of the expert group 
therapist is his ability to identify the therapeutic
potential of his patients. As a consequence,
he may then direct his attention toward
effecting a positive relationship with
those patients with whom he feels he can be
most useful. Indeed, he may recognize and
attempt to relate to those individuals who
tend to improve seemingly independent of
specific therapeutic efforts.

Although the findings support the notion
that Fiedler's instrument has a measure of
construct validity, the Q sort may offer only
a partial or even a superficial definition of the
therapeutic relationship. The ideal therapeutic
relationship standard defined by Fiedler
seems to describe conditions essential to any
meaningful social relationship, [regardless] of
therapeutic intent. This type of [relationship]
may be therapeutic per se; it also is [possible]
that this relationship may be [only a]
prerequisite condition for the [establishment]
of an as yet undefined and unmeasured therapeutic
relationship. Such a relationship may
involve the utilization of specialized techniques
and procedures which the therapist
may regard as essential for treatment, e.g.,
analysis of transference, free association,
dream interpretation, reliving of earlier emotional
experiences, etc. One interpretation of
the finding that patients who establish the 
better relationships with their therapists tend 
to perceive others as being more socially de- 
sirable may be that patients in a group take 


their cue in relating to each other from the 
quality of the therapist’s relationships with 
them. A more parsimonious explanation is 
that both therapist and patients react simi- 
larly to a given situation. 


SUMMARY 


This study reports an attempt to determine 
whether an association exists between the con- 
struct “therapeutic relationship” and outcome 
of treatment in a group therapy setting. The 
quality of the therapeutic relationship was 
measured by Fiedler’s ideal therapeutic rela- 
tionship Q sort. Three criteria of improve- 
ment were used: Comfort, Effectiveness, and 
Objectivity. These criteria were measured by 
14 scales. In addition, a study was made of 
the therapist-patient relationships established 
with patients who terminated therapy pre- 
maturely. 

The sample included 21 patients, 13 of 
whom were treated by one group therapist 
and 8 by another. The experimental treat- 
ment period was limited to 20 weeks, at 
which time outcome data were available on 
14 patients. 

Patients who established better relation- 
ships with their therapist tended to show 
greater improvement than those whose rela- 
tionships with the same therapist were not as 
good. In computing the overall pooled mean 
correlations between the therapeutic relation- 
ship scores and change measures for 14 pa- 
tients, significant correlations were found on 
three measures: increased Objectivity (staff 
evaluation), increased Effectiveness (group 
leadership), and increased self-ratings of 
Comfort (symptomatic relief). 

Premature termination of therapy by a pa- 
tient appears to be related to his perception 
of the “goodness” of the relationship he has 
established with his therapist relative to the 
general level of patient-therapist relationships 
within his group. Individuals having the 
poorer relationships in a group tended to 
drop out of therapy irrespective of the absolute
goodness of their therapeutic relationship.

The hypotheses postulating that benefit 
from psychotherapy and incidence of pre- 
mature termination were associated with the 
goodness of the individual patient-therapist 
relationship tended to be supported. These
findings give limited support to the validity 
of the concept “therapeutic relationship” as 
defined by Fiedler. 
A FACTOR ANALYSIS OF GERIATRIC ATTITUDES


WILSON H. GUERTIN ¹


University of Florida 


Despite current expressions of interest in
geriatric patients, there is a surprising lack
of attempts to evaluate the attitudes of the
aged objectively. The only specific instrument
available is the Activities and Attitudes Survey
of Cavan, Burgess, Havighurst, and Goldhamer
(1949). Nor have there been factor
analytic attempts to define the prominent dimensions
underlying the expressed attitudes
of these people. The present paper is a report
of a factor analysis of such attitudes
and serves as a basis for the [construction]
of a forced-choice Geriatric Attitude Scale
(Guertin & Krugman, 1961).


PROCEDURE 


A total of 166 "agree-disagree" items were composed
to provide a wide sampling of the attitudes of
institutionalized aged. Attitudes [toward] problems
of adjustment were emphasized, [text illegible]
psychiatric and physical [text illegible].
Forty-eight male residents [of]
the [Veterans Administration Center] at Martinsburg, [West Virginia,]
satisfactorily completed all the items. [The subjects]
were all 60 or more years old, and none [had]
psychiatric diagnoses.

After excluding 16 items because responses to them
were too uniformly in one direction, the remaining
150 items were subjected to a [cluster] analysis.
Key sort cards provided frequencies for a fourfold
contingency table for each possible pairing of
items. A link was noted when the tetrachoric correlation
was greater than .40 between [two] items. These
linkages and the direction of [correlation] were
transferred to slips which were [laid out] on the floor
in the form of an inter[correlation] matrix. Cluster
items were identified by observing [when] half or more
of the item linkages (row entries) [in a] column were
the same for a pair of test items.
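
The linkage rule can be sketched as follows. The paper does not say how the tetrachoric correlations were obtained from the fourfold tables; the cosine-pi approximation used here is one common hand-calculation shortcut and is an assumption, as are the function name and the example cell frequencies (chosen to sum to 48 respondents).

```python
import math

def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric r for a fourfold table.

    a and d are the concordant cells (agree/agree, disagree/disagree);
    b and c are the discordant cells.
    """
    if b * c == 0:
        if a * d == 0:
            raise ValueError("approximation undefined when both products are zero")
        return 1.0
    return math.cos(math.pi / (1 + math.sqrt((a * d) / (b * c))))

LINK_THRESHOLD = 0.40  # a link is noted above this value, per the text

# Hypothetical fourfold table for one pair of items (not from the paper).
r_example = tetrachoric_cos_pi(20, 4, 6, 18)
is_link = r_example > LINK_THRESHOLD
print(f"r_tet = {r_example:.2f}, link noted: {is_link}")
```

The approximation is rough when cell frequencies are very unequal, so a value near the .40 cutoff would merit a more exact estimate.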


1 Arnold D. Krugman [collaborated] with the author
in devising the items and [collecting] the data used
herein. The [work was done] while the author
was employed at the [Veterans] Administration Hospital,
Knoxville, Iowa.
RESULTS


Eight important clusters were identifiable 
but the complexity of interrelations made it 
clear that only factor analysis could clarify 
the underlying structure. Therefore, 27 promi- 
nent cluster items were selected to provide a 
matrix of tetrachoric interrelations. Multiple- 
group factor analysis and blind rotation to 
oblique simple structure by the single plane 
method produced the factor matrix in Table 1. 
Items employed in the factor analysis are 
identified by asterisks, but the calculation of 
additional item loadings was necessary to 
provide the content for an understanding of 
the nature of the obtained factors. 

Items in Table 1 without asterisks are those 
not originally factor analyzed, but for which 
loadings were calculated by extending the 
factor matrix (Cattell, 1952, p. 406). Only 
those items with heaviest loadings are re- 
ported here. Item descriptions are in abbrevi- 
ated form and decimal points have been 
omitted for convenience in presentation. 

It may be of interest to follow the fate of 
the eight clusters of variables. Three defined 
three of the multiple-group factors with a 
single variable from a fourth cluster pulled 
into one of the factors. Another cluster split 
to form two factors. The three remaining 
clusters failed to contribute uniquely to the 
factor structure. 

Conventionally, factors obtained from the 
multiple-group factor analysis are rotated to 
orthogonal positions to make it possible to 
calculate communality and residuals. The 
orthogonal matrix then is rerotated blindly to 
oblique simple structure. Since the first ma- 
trix obtained in factoring often approximates 
the final oblique simple structure solution, it 
is of some interest to compare the two matrices.
The following are five items in the
original factor analysis, selected from Table 1
as having the single highest loading in each
of the factors. The values in the first column
of the pair are for the multiple-group loadings
while the second are for loadings from
the final rotated matrix.

    A     A'    B     B'    C     C'    D     D'    E     E'
    91    81    10    12    [?]   07    57    26    30    43
    22    46    88    77    [?]   16    62    [?]   [?]   [?]
    11    20   -10   -10    [?]   [?]   [?]   [?]   30    [?]
    38    35    06   -35   -27    03    84    82    10   -01
    46    16   -13   -38    04    17    18    10    [?]   91

Basic and prominent general attitudes underlie
the original 27 x 27 intercorrelation
matrix as testified by the very high 72% of
the total variance accounted for by the five
factors. The total estimated communality was
19.42, which was completely accounted for
by the five factors. Communalities, calculated
from orthogonal loadings, are listed in Table
1. Intercorrelation between factors is generally
high as seen in Table 2.
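
The 72% figure follows from dividing the total communality by the number of variables, since each standardized item contributes unit variance. A minimal check, assuming the total communality is the 19.42 printed beneath Table 1:

```python
N_ITEMS = 27               # items in the 27 x 27 intercorrelation matrix
TOTAL_COMMUNALITY = 19.42  # sum of h² as printed beneath Table 1

# Each standardized item has unit variance, so total variance equals N_ITEMS.
share_of_variance = TOTAL_COMMUNALITY / N_ITEMS
print(f"variance accounted for: {share_of_variance:.1%}")
```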
TABLE 1

ROTATED OBLIQUE FACTOR LOADINGS

                                                      Factors
                                      Anx.-            Pres.   Phys.
Item Description                      Dysph.  Alien.   Int.    Compl.  Incap.   h²

Always fearful*                         81      12      07      26      43     [?]
Quite unhappy                           81      45      31      40      19      72
Need much sleep*                        80     -05      10      32     -08      81
Sick frequently*                        79      11      27      18      22      75
Feel unloved*                           78      17      02      49      49      86
Restless sleep                          78      39     -01      28      36      17
Lonely                                  77     [?]      26      22      19      86
Can't keep mind on things               76     -10      26      60      36      82
Unwanted                                75      17     -08      24      60      99
Best time of life is now                74      12      22      45      28      60
Worry too much*                         69      05      27      67      14      66
Have high blood pressure*               67      00      57      54      44      74
If there's a Heaven I'll go*            67     [?]     [?]     [?]     [?]     [?]
Best time of life is when child*        45      12      39     -04     -06      44

Don't care what happens to me*         [?]     [?]     [?]     [?]     [?]     [?]
Family doesn't care about me*          [?]      74     -08      22      45      78
No self-respect left*                   36      73     [?]     [?]     [?]     [?]
Don't care to see relatives            [?]     [?]     [?]     [?]     [?]     [?]
[item illegible]                       [?]     [?]     [?]     [?]     [?]     [?]
Nothing left to live for                15      70      32     [?]     [?]      87
Don't believe in God*                   07      66      03     [?]     [?]      83
Nothing interests me anymore            39      66     [?]     [?]     [?]     [?]
Don't eat enough                        27      62     -04     [?]     [?]      16
Don't think will live a year            28      59      30     [?]     [?]      70

Don't like to go visiting*              08      29     [?]     -20      08      18
Have lots of friends                   [?]     [?]     [?]     [?]     [?]     [?]
Very religious*                         20     -10     [?]     [?]     [?]     [?]
Money root of evil                      56      37      78      37      31     [?]
Wish had more freedom*                  00     -28      75      57      52      90
Wars root of all trouble                69      08      75      42      45      92
Mentally younger than appearance        06      16      66      14     -02      55
Older cast off by younger               66      04      63      32      07      76
Trouble walking*                        16     -22      60      52      59      68
Life has been tragic                    27     -10      60      39      45      50
Wish had better clothes*                21     -43      59      20      06      57
People inconsiderate of others*         41      19      59     -01     -13      63
TABLE 1 (Continued)

                                                      Factors
                                      Anx.-            Pres.   Phys.
Item Description                      Dysph.  Alien.   Int.    Compl.  Incap.   h²

No sex interests                        37      33     [?]      90      07      95
Nothing left to live for                52      50     [?]      84      32      96
Bad headaches frequent*                 35     [?]     [?]      82     -01     106
Fair to retire people at 65            [?]     [?]     [?]     [?]     [?]     [?]
Stomach trouble                        [?]     [?]     [?]     [?]     [?]      68
Worry about health*                    [?]     [?]     [?]     [?]      02      77
Dizzy spells                            49     [?]     [?]     [?]     [?]      67
Feel helpless                          [?]     [?]     [?]     [?]     [?]      96
Often very unhappy                     [?]     [?]     [?]     [?]     [?]      13
Wish had better education*             [?]     [?]     [?]     [?]     [?]      43

Part of body paralyzed*                 16     [?]     [?]     [?]     [?]     108
Way of living very unnatural*          [?]     [?]     [?]     [?]     [?]     [?]
Women ruined my life                    15      69      24     [?]     [?]      74
Drs. & nurses don't care about us      [?]      07     [?]     [?]     [?]     111
Have not lived good life*               65     [?]      13      19      64     [?]
Get annoyed easily                      48      47     -01      30      60      69
Future holds nothing*                   32      32      08      18      57      45
Young can't be bothered with old       [?]      08      25      09      55      36
People make own troubles*               06      02     -24     -05     [?]      17
Death a relief from suffering*         [?]     [?]     [?]     [?]     [?]     [?]

Σh² = 19.42

Note. Items employed in the factor analysis are identified by asterisks.
Decimal points omitted.


TABLE 2

INTERCORRELATIONS BETWEEN OBLIQUE FACTORS

                                   Pres.   Phys.
                          Alien.   Int.    Compl.  Incap.

Anxiety-Dysphoria          [?]     [?]     [?]     [?]
Alienation                         [?]     [?]     [?]
Preserved Interest                         .40     [?]
Physical Complaints                                [?]

[Entries marked [?] are illegible in the source.]


Discussion


The Anxiety-Dysphoria factor combines the
feelings of fear, tension, and being unwanted.
Preoccupations with health and social situation
[reflect] personal instability and general uneasiness.
Since the manifestations are largely
subjective, a superficially satisfactory adjustment
is not precluded by their presence. However,
the underlying lack of self-confidence
and general insecurity represent a deficiency
in an attribute necessary for flexible adaptation
to environmental change.


The Alienation factor demonstrates hostil- 
ity and dysphoria as a reaction to feeling re- 
jected. It is a disgruntled reaction reflecting 
ambivalence toward dependency. While love 
and help are desired strongly, these needs are 
vehemently denied. Hostility, which drives 
others away, serves as a defense against suc- 
corance by them. These attitudes probably 
find ready expression in response to efforts of 
others to establish independence in the aged. 

The Preserved Interest factor represents a 
relatively high level of interest in self and en- 
vironment. While there may be some narrow- 
ing of interests with aging, and certainly re- 
duced activity, the factor reflects strength 
and scope of interest as a resource. This char- 
acteristic permits a high level of social ac- 
tivity which may lead to the rewards of be- 
ing well liked. Triteness and superficiality en- 
ter into determining this factor so that while 
possession of the characteristics of this factor 
may be essential to being interesting and well 
liked, garrulousness may have an adverse
effect.

The Physical Complaints factor represents
a focusing of interest on the self in terms of 
body functions. Since there is no control for 
physiologically based illness built into the 
study, we must assume a variety of rea- 
sons for the complaints. They may be based 
upon systemic malfunctioning and anatomical 
changes associated with aging, chronic or 
acute disease, or may represent a hypochon- 
driacal exaggeration. 

The Incapacitation factor is based upon 
crippling physical disease and the reaction to 
it. The significance of some of the heavily 
loaded items is not apparent and may be 
rather specific for the sample of subjects em- 
ployed. However, it is a sizeable factor and 
cannot be disregarded. 


SUMMARY 


Geriatric attitudes of 48 elderly residents
of a Veterans Administration center were sampled.
Analysis revealed five important attitudinal
factors: Anxiety-Dysphoria, Alienation,
Preserved Interest, Physical Complaints,
and Incapacitation.


CLINICAL JUDGMENTS AND THE DRAW-A-PERSON TEST


ROBERT E. STOLTZ AND FRANCES C. COLTHARP


The Draw-A-Person Test (DAP), as developed
by Machover (1949) and Goodenough (1926),
has become an important part of the clinical
psychologist's battery of assessment techniques.
While widely used to provide information
regarding the intellectual functioning and
emotional and social behavior of a person,
there is a paucity of adequate data regarding
its empirical validity. Swenson (1957), in an
extensive review of the literature regarding
human figure drawings, found modest confirmation
of some of Machover's hypotheses regarding
group trends, but little evidence of the value
of the DAP for individual diagnosis. The evidence
would also indicate that the validity of the
method for determining the level of a person's
intellectual functioning is better established
than is its validity for predicting the more
complex and less well defined areas of social
and emotional behavior. An additional deficiency
in the existing literature is that those studies
which have attempted to determine the validity
of the DAP for emotional and social behavior
have tended to use the pooled judgments of
clinicians without investigating the problem
of the individual clinician's contribution to
the resulting prediction, e.g., Tolor and Tolor.

This study represents an attempt to provide
additional data regarding the validity of
the DAP with regard to predicting intellectual,
social, and emotional behavior, and to provide
data regarding the ability of individual
clinicians to make such predictions.


METHOD 

Subjects were 60 fourth grade school children, each
of whom furnished a drawing of a person of each
sex according to the usual DAP procedures.
The judges were three clinical psychologists, each of
whom was experienced in the use of the DAP and
regarded as professionally competent. The judges
were asked to rate the drawings for the traits of 
Intelligence, Sociability, and Emotional Maturity. 
The ratings were done using a nine-point scale. The 
only information available to the judges was the age 
and sex of the child, which of the two drawings was 
completed first, and a detailed description of the test- 
ing procedure and criterion definitions for the three 
traits. 

The criterion for Intelligence was the child’s score 
on the Otis Quick-Scoring Mental Ability Test (Beta, 
Form EM), the criterion for Sociability was the 
preference rating given to the child by his fellow 
students on a sociogram, and the Emotional Ma- 
turity of the student was judged from a teacher’s 
rating in which each teacher nominated the five most 
well-adjusted and the five most seriously emotion- 
ally disturbed boys and girls in her room. The 
teacher nomination form developed by Smith (1958) 
was used. On the basis of this technique the subjects 
were divided into groups of poor, average, and above 
average adjustment. 


RESULTS 


Distributions of the ratings for each of the
traits by each of the judges were examined
and found to be generally normally distributed,
as were the criterion scores. The assumptions
for computing the Pearson product-moment
correlation were met, and this method
of statistical analysis was used.¹ As the criterion
for emotional adjustment was essentially
trichotomous, all correlations regarding
this trait were corrected for coarseness of
grouping.


1 All statistical computations were done with the
assistance of the Southern Methodist University
Computing Laboratory on the Univac 1103.


TABLE 1

CORRELATIONS BETWEEN JUDGES' RATINGS AND
INDEPENDENT MEASUREMENTS

                                                      Emotional
                          Intelligence  Sociability   Adjustment
Judge A                      .253*         .158         −.t4:
Judge B                      .496**        .006         ati
Judge C                      .583**        .179         .165
Average for
Judges A, B, C               .455          .115         1.53

* p < .05.
** p < .01.


TABLE 2

CORRELATIONS BETWEEN JUDGMENTS
AND JUDGES

                                        Judge
Traits                                  A B C
Intelligence-Sociability                .176 Pe aad
Intelligence-Emotional Adjustment       .381** = .447**
Sociability-Emotional Adjustment        .082 ste
Average Intercorrelations               .220 .615**

** p < .01.


Table 1 gives the correlations between the 
judges’ ratings and the criterion for each of 
the three traits, as well as the average judge 
intercorrelations for each trait.² Table 2 gives
the intercorrelations between the ratings for 
the three traits by each of the judges, as well 
as the average trait intercorrelation for each 
judge. Table 3 gives the intercorrelations be- 
tween the three judges for each of the three 
traits as well as the average intercorrelations 
of the judges for each of the traits. 
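
The averaging procedure mentioned in the footnote (convert each correlation to Fisher's z, average, convert back) can be sketched as follows. This is only an illustrative sketch: the function name `average_correlations` is hypothetical, and the three Sociability values are the Judge A, B, and C entries as they can be read from Table 1 of this article.

```python
import math


def average_correlations(rs):
    """Average correlation coefficients via Fisher's z transformation.

    Each r is converted to z = atanh(r), the z values are averaged,
    and the mean z is converted back to a correlation with tanh.
    """
    zs = [math.atanh(r) for r in rs]
    mean_z = sum(zs) / len(zs)
    return math.tanh(mean_z)


# Sociability correlations for Judges A, B, C (as read from Table 1).
sociability = [0.158, 0.006, 0.179]
avg_sociability = average_correlations(sociability)
print(round(avg_sociability, 3))
```

Under this reading of the table, the result agrees with the tabled average of .115 to three decimals; for correlations this small the z-based average differs little from a simple arithmetic mean.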


DISCUSSION 


As is evident in Table 1, the three judges 
were able to predict the intelligence test per- 
formance of the subjects to a significant de- 
gree. However, there is a considerable differ- 
ence among the judges in their ability to 
predict this criterion. For example, a test of
the significance of a difference between cor- 
relation coefficients indicated that Judge C 


2 All correlations were converted to Fisher's z coefficients
for averaging and then converted back to
correlation coefficients.

is significantly better able to predict this trait
than is Judge A. None of the judges is able 
to predict the Sociability criterion or the Emo- 
tional Adjustment criterion to a significant 
degree. Even more distressing is Judge A’s 
prediction of Emotional Adjustment which is 
negatively correlated with the criterion. Ef- 
forts to obtain multiple correlations between 
optimally combined judges’ ratings and the 
trait criteria resulted in multiple Rs of .614, 
.230, and .293 for Intelligence, Sociability, 
and Emotional Adjustment, respectively. We 
must conclude that even optimum weighting 
of the clinician’s judgments does not produce 
significant prediction of the latter two criteria. 

In the design of the study an effort was 
made to choose criterion areas which would 
be distinguishable from each other, and whose 
specific criterion scores would tend to be in- 
dependent of each other. The intercorrelations 
of the criteria furnish some data bearing on 
the extent to which this effort was successful. 
The correlation between the intelligence scores 
and the sociogram scores was .414; between 
intelligence scores and the teacher ratings on 
Emotional Adjustment, .495; and between the 
sociogram scores and the teachers’ ratings of 
Emotional Adjustment, .458. Each of these 
correlations is significant beyond the .01 level. 
It can be concluded that the criteria were not 
completely independent, but were only mod- 
erately so. 

From Table 1 it can be concluded that 
Judge C is the best judge of the criteria, while 
Judge A seems to be the poorest overall. This 
difference in the ability to predict the cri- 
teria seems to be due to the joint influence 
of a judge’s ability to predict a subject’s in- 
telligence test performance and to the extent 


TABLE 3
CORRELATIONS BETWEEN JUDGES

                                                    Average
                                       Emotional    Correlation
Judges    Intelligence   Sociability   Adjustment   for Judges
A-B          .579**         .243          .307*        .390
A-C          .555**         .321*         .349**       .445
B-C          .702**         .495**        .504**       .657
Average on
Variables    .615**         .350**        .390**

to which a given judge implicitly viewed intelligence
as a trait related to Sociability and
Emotional Adjustment. As indicated above,
both the Sociability and Emotional Adjustment
criteria were correlated with the Intelligence
criterion. Partialing out the effect of
the Intelligence criterion reduced the correlation
between Sociability and Emotional Adjustment
to .320, indicating a fairly large contribution
to the zero order correlation between
these two criteria. Table 2 shows
clearly that Judge C produced trait ratings
that were highly correlated with each other,
while Judge A's trait ratings showed a significant
correlation only in the case of the
Intelligence-Emotional Adjustment ratings. Since
the Intelligence criterion contributes strongly
to the other criteria, and since Judge C is not
only the best predictor of this criterion, but is
also the judge who shows the strongest tendency
to use ratings on this dimension as an
element in predicting the other criteria, it
would follow that Judge C would appear to
be the most accurate judge overall. The reverse
of this argument would hold for Judge
A, and the same argument would explain the
appearance of Judge B as the judge with an
intermediate degree of overall predictive ability.
It would appear that the degree of
overall predictive accuracy of a
given clinician is, in this case, a direct
function of the clinician's ability to derive
a valid index of Intelligence from the DAP
and little more.
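
The partialing step reported above can be checked with the standard first-order partial correlation formula. The sketch below is illustrative only: the function name is hypothetical, and the three inputs are the criterion intercorrelations quoted earlier in this Discussion (.458, .414, and .495).

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation between x and y with z held constant."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


r_soc_emo = 0.458  # sociogram scores vs. teacher ratings of Emotional Adjustment
r_int_soc = 0.414  # intelligence scores vs. sociogram scores
r_int_emo = 0.495  # intelligence scores vs. teacher ratings

partial = partial_correlation(r_soc_emo, r_int_soc, r_int_emo)
print(round(partial, 3))
```

With these inputs the formula gives a value that rounds to the .320 stated in the text.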
Table 3 indicates the extent to which the
judges agree with each other in their ratings
of the three traits. As would be expected from
the other tables, the judges show their best
agreement when it comes to predicting Intelligence
from the DAP, and much less agreement
when it comes to predicting the other
two criteria. The higher agreement between
Judges B and C can be explained by the
greater extent to which these judges appear to
be reacting to a general factor in their trait
definitions, and would indicate that these general
factors are similar for the two judges.
The present study was not designed to identify
this general factor. It may be simply a liking
for some drawings, a reliance on certain common
references in the literature, a carry-over of
their general impression of the person's adjustment,
or it may be some halo effect derived
from a source in the drawings as yet
unknown.

The findings of this study would not, in 
general, support the use of the DAP as a 
measure for predicting behavior criteria in 
the area of Sociability or Emotional Adjust- 
ment. The findings would lend support to the 
use of the DAP as a measure of intellectual 
functioning, but would also support the earlier 
reviews which suggest that the relationships 
are not adequate for individual prediction. 


SUMMARY 


The present study was designed to provide 
additional data regarding the ability of clini- 
cal psychologists to predict criteria of intelli- 
gence, sociability, and emotional adjustment 
from human figure drawings. The subjects 
were 60 fourth grade school children who 
were given the Draw-A-Person Test in a 
group situation. Three clinical psychologists 
judged the extent to which each child’s draw- 
ings indicate the existence of one of the three 
trait criteria. The relations of the clinical 
judgments to the criteria were statistically 
compared. 

The psychologists were able to predict in- 
telligence to a statistically significant degree, 
but were unable to predict either sociability 
or emotional adjustment. Although working 
independently, the judges did show a signifi- 
cant amount of correlation with each other 
in their predictions. 

Factors influencing the ability of the judges 
to produce ratings that would correlate with 
the criteria are discussed.

The rate of occurrence of a characteristic in
a specified population has come to be called 
the base rate of the characteristic. Meehl and 
Rosen (1955) have discussed the importance 
of considering base rates in evaluating a pre- 
dictive system. They point out that “a psy- 
chometric device, to be efficient, must make 
possible a greater number of correct decisions 
than could be made in terms of the base rates 
alone”² (p. 194). An illustration used by
these authors was the prediction of juvenile 
delinquency by the Gluecks (Glueck & 
Glueck, 1950). In their example where the 
base rate concept had been ignored, the data 
were in effect treated as though the base rate 
were 50%, which is highly unlikely. The 
present authors have cross-validated the same 
predictors and the conclusions drawn by Meehl 
and Rosen were borne out (Wirt & Briggs,
1960). 

Further examples of the importance of the 
base rate can be found with ease. An inter- 
esting recent article by Schofield and Balian 
(1959) compares the incidence of psychic 
trauma among normal and schizophrenic pa- 
tients. Their results indicate that the base 
rate for trauma is so high that it is obviously 
not peculiar to their schizophrenic sample. 
This study followed the form suggested by 
Pearson and Kley (1957) who remark: 


Eventually, like it or no, we will have to come to
grips with the high probability that the base rate
problem applies in the prediction of mental disorder
from kind and number of traumatic life experiences,
just as it applies in the case of psychometric prediction
(p. 407).


1 This research was supported in part by Grant
1151c from the National Institute of Mental Health,
Public Health Service, United States Department of
Health, Education and Welfare; and in part by the
Graduate School of the University of Minnesota.
The authors wish to express their appreciation to
the federal government and to their university.

2 There is one minor exception to this rule, viz.,
when the valid positive rate equals the valid negative
rate, accuracy of prediction is independent of
the base rates.


Pearson and Kley go on to lament with others
the fact that behavioral scientists do not tend
to consider base rates in their study of case
materials.

The present study suggests that prediction
in the area of delinquency is not an altogether
lost cause at this time if one's goals
are reasonable or moderate and one tends to
remain in the relatively narrow context in
which the criterion data were gathered. In
this connection Pearson and Kley (1957)
suggest


. . . that individuals in a population with a known
and relatively high incidence rate for a particular
disorder may be submitted to longitudinal investigation
of a kind which would not be economical for
samples drawn from the general population (p. 400).


Although their argument was aimed at the discovery
of etiological factors, one may also
examine efficiency of a treatment program
through the study of a highly concentrated
sample of cases among whom pathology may
be expected to occur.

The approach used here was the multiple
criteria technique discussed by Meehl and
Rosen (1955). The first criterion was the
Minnesota Multiphasic Personality Inventory
(MMPI); cases were selected whose MMPI
profile had certain scale elevations that were
known to be related to delinquency (Hathaway
& Monachesi, 1951). The second criterion
was selection within this group using
family history data developed in another
study by the authors. It was shown in the
earlier study that a great number of family
history factors, especially those commonly
recognized as tragic, can be related to delinquency
(Wirt & Briggs, 1959). Thus with
a well recognized personality technique and
one of a number of possible indicators of
family disorder, it was possible to obtain the
results demonstrated below.


METHOD 

Samples

From a previously described sample of nearly two
thousand boys tested by Hathaway and Monachesi
(1951) during the school year 1947-48, a sample of
573 cases stratified on MMPI code and delinquency
was drawn from a population of 1,958 cases. These
subjects were tested in the ninth grade and
were approximately thirteen years old at that time.
Delinquency ratings were based on the period following
testing, so that in this study the […]
of delinquency refers to the prediction of a subsequent
phenomenon.

[…] and this subsample of […]
cases in the population. The remaining group was
composed of boys whose MMPI codes did not include
the excitor combinations. There were 372 cases
[…] population. The excitor combinations were
Pd-Sc, Pd-Ma, Sc-Pd, Sc-Ma, Ma-Pd, and Ma-Sc
(see Hathaway & Monachesi, 1951, for a
justification of these procedures). Since the code
numbers for these scales are 4 = Pd, 8 = Sc, and
9 = Ma, the excitor code sample is called the “489”
group.

Most studies of delinquents have used some contact
with the police as a defining criterion, omitting
the question of severity altogether. Delinquent severity
is not a homogeneous personality dimension
but probably reflects a sociological or judgmental
dimension of the society or of the perceiver of the
delinquent (Wirt & Briggs). The definition of
delinquency adhered to in this study was a rating
based upon court and police docket records. A dichotomous
split was made: (a) a less severe criterion
including all cases who had any contact with
the police and (b) a more severe criterion excluding
persons whose contacts with the police involved only
minor infractions.


3 The sample/population proportions are not equitable
because the samples actually include subsamples
which were weighted differentially to produce a
valid estimate of population values.


Family History Factors


The source from which these data were derived
was a survey of the records of 11 social agencies
(voluntary and governmental). The case records of
all boys and their families […]
the admission files of the […] completely
reviewed for data which […]
important. The data were […]

form and were found to include 42 fairly common 
but discrete items. These items were grouped into 
seven more general categories: family disruption, 
poverty or need, dissocial behavior, psychiatry for 
family, marital disruption, inadequate parent-child 
relationship, and minor psychological problems. The 
present study focused on data from one of these 
categories, “family disruptions due to disease,” which 
included six items: mother dies, father dies, mother 
chronically ill, father chronically ill, siblings die, or 
siblings are chronically ill. A category score could 
be determined by assigning one point for each item 
present in the family history. Thus a score of 1 
point meant that at least one of the items was true 
for a given family, 2 points meant that two items 
were true or that one item occurred twice, etc. 
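
The scoring rule just described (one point per item present, with a repeated item counted each time it occurs) can be illustrated with a short sketch. The item labels below are shortened versions of the six items listed above; the function name, the example family, and the extra "poverty" entry are hypothetical and used only for illustration.

```python
from collections import Counter

# The six items of the "family disruptions due to disease" category.
DISEASE_ITEMS = (
    "mother dies",
    "father dies",
    "mother chronically ill",
    "father chronically ill",
    "siblings die",
    "siblings chronically ill",
)


def category_score(recorded_events):
    """Return one point for every occurrence of a category item.

    Events outside the category are ignored; an item recorded twice
    contributes two points, following the rule stated in the text.
    """
    counts = Counter(e for e in recorded_events if e in DISEASE_ITEMS)
    return sum(counts.values())


# Hypothetical family history record (not taken from the study's data).
example_family = [
    "father dies",
    "siblings chronically ill",
    "siblings chronically ill",
    "poverty",
]
score = category_score(example_family)
print(score)
```

Here the hypothetical family receives a score of 3: one point for the father's death and two for the item that occurred twice.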

It is to be understood that these items of infor- 
mation were recorded in the social agency records 
before the subjects had been delinquent and before 
they were tested by Hathaway and Monachesi in 
1948. The delinquency that is referred to occurred 
after testing and thus also after any particular dis- 
ruptions of the family due to disease. 


RESULTS 


The data are presented for two degrees of 
delinquency. The less severe criterion, which 
nets of course the largest percentage of delin- 
quents, shows the estimated overall popula- 
tion rate of delinquency to be 41%. Among 
cases with elevated excitor (i.e., 489) codes, 
the proportion of delinquents was approximately
43%. Using the slightly more severe
criterion of delinquency in which minor of- 
fenses were excluded from the delinquent 
sample, the overall population rate was 32% 
delinquent, while the rate for the 489s was 
35% delinquent. 

Among cases with instances of family dis- 
ruption due to disease, the rate of delinquency 
for all codes tended to increase as the num- 
ber of family disruptions due to disease in- 
creased. That is, accuracy of delinquency 
prediction improved for cases known to have 
family disruptions due to disease regardless 
of MMPI patterns of the subjects. By calling 
the whole population delinquent the accuracy 
of prediction would be only 32%; by calling 
cases with a score of 1 or more points on 
family disruption due to disease delinquent, 
accuracy of prediction would rise to 43%, 
at a score of 2 or more points accuracy would 
reach 53%, and accuracy of such prediction 
increases to a maximum of 63%. Thus if one 
is interested in selecting potential delinquents 
for treatment, knowledge of social agency con-

TABLE 1
PROPORTION OF BOYS WITH 489 MMPI CODES AT EACH OF SEVEN LEVELS OF FAMILY DISRUPTION
DUE TO DISEASE WHO BECAME DELINQUENT AT A BASE RATE OF 43% DELINQUENT

PREDICTION RATIOS

Delinquents Nondelinquents

Number of
Disruptions Est. cf p1 Est. cf p2 q2 Pp1 Qp2 Qq2 Ht Rp Hp
0 235 1.00 315 1.00 .00 43 57 00 43 1.00 a 
1 57 24 51 16 84 10 09 AS 58 20 2 
2 35 a5 16 .05 .95 .06 -03 54 61 09 ‘38 
3 27 BE? 4 .01 .99 .05 -01 7 OL 06 ‘g1 
4 16 .07 4 .01 .99 -03 -01 57 -60 04 79 
5 15 -06 4 O1 99 03 OL Ba 4 59 03 “68 
6 8 -03 4 O1 99 02 O1 57 .58 02 100 
7+ 4 02 00 00 1.00 01 00 at 8 OL e 



Note.—Apparent discrepancies in table are due to rounding figures. 


tact would be an asset. It should be noted,
however, that this does not aid much in prediction
of nondelinquency.

When MMPI code is taken into account,
accuracy in prediction of delinquency is increased
even more. The cases with 489 codes
have a frequency of 28% in the population.
The complete results for the 489 cases are
presented in Tables 1 and 2 for the two different
levels of delinquent severity. This is a
demonstration of the prediction model developed
by Meehl and Rosen (1955). Here
the first column indicates the score (i.e., the
number of disruptions in the family due to
disease) from 0 through 7 or more. The second
column gives the cumulative frequencies
of these 489 cases who became delinquent at
each social agency score from a maximum of
7 or more points to a minimum of 0; these
are estimates of the population cumulative
frequencies. The column entitled p1 transforms
these cumulative frequencies to proportions
based on the total number of 489
cases that became delinquent. The third column
gives the cumulative frequencies for
those 489 cases that did not become delinquent
at each social agency score. Correspondingly,
the column p2 transforms these cumulative
frequencies to proportions based on
the total number of 489 nondelinquents.

For example, a p1 of .17 at a family disruption
score of 2 indicates that 17% of the
489 cases who became delinquent scored 2 or
more points on this particular social agency

TABLE 2 


PROPORTION OF BOYS WITH 489 MMPI CODES AT EACH OF SEVEN LEVELS OF FAMILY DISRUPTION
DUE TO DISEASE WHO BECAME DELINQUENT AT A BASE RATE OF 35% DELINQUENT

PREDICTION RATIOS

Delinquents Nondelinquents

Number of
Disruptions Est. cf p1 Est. cf p2 q2 Pp1 Qp2 Qq2 Ht Rp Hp
0 191 1.00 359 1.00 00 35 65 1.00 35 1.00 3 

1 45 24 63 18 82 08s i 54 62 20 s 

2 33 A7 18 05 95 .06 .03 62 “68 ‘09 E 

3 25 13 7 02 8 05 m Ce ‘06 T 

4 15 .08 5 01 98 03 oy O04 67 04 "9 

5 15 08 4 O1 99 03 O01 64 67 03 a 

6 8 04 4 01 99 02 01 64 66 ‘02 oF 
74 4 02 0 00 100 01 o 65 66 oo 

Note. Apparent discrepancies in table are due to rounding figures.

scale. Similarly, a p2 of .05 indicates that
only 5% of the nondelinquents scored 2 or
more points on this scale.

The proportions given in p1 and p2 do not
take into account the base rates.

Column q2 represents the valid negative
rate before the base rates are taken into account.
This indicates the proportion of all
489 nondelinquents that are appropriately
labeled “nondelinquent” at each social agency
scale score. The column entitled Pp1 is p1
multiplied by the base rate of delinquency
for 489s and represents the valid positive rate.
The column Qp2 is p2 multiplied by the base
rate of nondelinquency and represents the
false positive rate. Qq2 is q2 multiplied
by the base rate of nondelinquency and represents
the valid negative rate. Ht is Pp1 +
Qq2 and represents the overall accuracy of
prediction, which is the accuracy with which
one can call individuals falling below the given
cutting scores nondelinquents and those at
and above it delinquents. The column entitled
Rp is Pp1 + Qp2 and represents a sort of selection
ratio telling the proportion of people
for whom one is predicting at each level on
the social agency rating (i.e., at a score of 1
or greater, of 2 or greater, etc.).
Finally, a column Hp is Pp1/Rp and it represents
the proportion of people at each level
on the social agency rating who are delinquent.
This is the accuracy with which one
can predict delinquency alone without regard
to valid or false negatives. It is the most important
statistic if one is trying to select a small sample
for treatment.
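
The column definitions above can be traced through in a short sketch. This is an illustration under stated assumptions, not the authors' computing procedure: the function name and dictionary keys are hypothetical, the symbol Ht is our reading of the garbled column label, and the cumulative frequencies are those that can be read from Table 2 (the more severe criterion, base rate of 35%).

```python
def prediction_ratios(delinquent_cf, nondelinquent_cf, base_rate):
    """Compute the prediction ratios for each cutting score.

    delinquent_cf / nondelinquent_cf: cumulative frequencies of cases scoring
    at or above each score (index 0 = all cases). base_rate: proportion
    delinquent (P); Q = 1 - P.
    """
    P = base_rate
    Q = 1.0 - base_rate
    rows = []
    for score, (cf_d, cf_n) in enumerate(zip(delinquent_cf, nondelinquent_cf)):
        p1 = cf_d / delinquent_cf[0]      # delinquents at or above the score
        p2 = cf_n / nondelinquent_cf[0]   # nondelinquents at or above the score
        q2 = 1.0 - p2                     # nondelinquents below the score
        Pp1, Qp2, Qq2 = P * p1, Q * p2, Q * q2
        Rp = Pp1 + Qp2                    # selection ratio
        rows.append({
            "score": score, "p1": p1, "p2": p2, "q2": q2,
            "Pp1": Pp1, "Qp2": Qp2, "Qq2": Qq2,
            "Ht": Pp1 + Qq2,              # overall accuracy
            "Rp": Rp,
            "Hp": Pp1 / Rp,               # proportion delinquent among those selected
        })
    return rows


# Cumulative frequencies as read from Table 2 (scores 0 through 7 or more).
delinquent_cf = [191, 45, 33, 25, 15, 15, 8, 4]
nondelinquent_cf = [359, 63, 18, 7, 5, 4, 4, 0]
rows = prediction_ratios(delinquent_cf, nondelinquent_cf, base_rate=0.35)

for row in rows:
    print(row["score"], round(row["Rp"], 2), round(row["Hp"], 2))
```

At a cutting score of 3 this gives a selection ratio of about .06 and an Hp of roughly .78, close to the 79% cited in the text; small differences are consistent with the rounding noted under the tables.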
These tables show the advantage of proceeding
from a high base rate of delinquency.
The selection ratio, Rp, which is the percentage
of people for whom predictions are
made at each level, does not change appreciably
with the change in rate (compare Rp in
Table 1 with Rp in Table 2). The accuracy
of delinquency prediction, Hp (and
therefore the percentage of people at each
level who are delinquent and who are correctly
identified), increases as the percentage
of delinquents increases in the population. It
should be noted that at the more severe criterion,
with a 35% delinquency rate in the
overall population, it is possible to identify
a sample that is 79% delinquent although

TABLE 3
[…] WHICH WOULD BE FOUND DELINQUENT
FOR THREE TYPES OF SELECTION STARTING
WITH 1,000 BOYS CHOSEN RANDOMLY

                                 Success at Each Rate

                                 N          %       N
Selection          Severity      Selected   Del.    Del.

Random             Severe        1000       32      316
                   Mild          1000       41      408

All 489            Severe        281        35      98
                   Mild          281        43      120

489 and            Severe        16         79      13
≥ 3 Disruptions    Mild          16         88      14


this group represents only approximately 6% 
of the 489 cases. At the less severe criteria of 
delinquency where the overall rate is 43%, it 
is possible to identify a sample that is almost 
88% delinquent and again it includes only 
6% of the cases in the 489 population. 

The success with which delinquency can be 
predicted based on a hypothetical 1,000 cases 
using the information described above is 
shown in Table 3. Depending upon the needs 
of the individual situation, it is possible to 
predict that 316 boys in a population of 1,000 
would be fairly severely delinquent and as 
useful information is added, one can follow 
the 1,000 cases down to the precise but re- 
stricted sample obtained through the study of 
16 cases of whom 14 will be delinquent. 


Discussion 


The principles involved in selection de- 
scribed here are empirical rather than theo- 
retical and do not carry general implications 
concerning the cases selected. It would be 
judged from other data not presented that a 
number of such techniques could be developed 
using different factors within the histories of 
delinquent boys. Some such techniques would 
probably be better than others and some 
would be more stable than others. Without 
theoretical knowledge of the reasons for the 
operation of each of the particular criteria 
established, it would be impossible to know 
when environmental or cultural changes would 
affect the validity of the techniques that are 
presented or developed. These tend to be the 
risks involved in using such an approach as
the present one. Of course, it is not impos-
sible to build in safeguards, to define popula- 
tions, and to cross-validate from time to time, 
in order to raise the likelihood that one is 
not practicing an impoverished statistical 
ritual. 

It was indicated earlier that it is likely that 
delinquency is not a homogeneous psychological
variable. It is obvious that the selection
of delinquents through some particular set of 
criteria will net a very special subpopulation 
of delinquents, a population which is not a 
random sample of all delinquents. Therefore, 
the research worker employing selection tech- 
niques based upon factors which heighten the 
likelihood of delinquency in a specific way 
will tend to meet with a specific type of case. 
This should be an advantage since there is 
some insurance of homogeneity within the 
subpopulation of delinquency besides the like- 
lihood of future delinquency. Furthermore, 
the procedures themselves, although not theo- 
retically involved in the understanding of 
delinquency, certainly suggest some factors 
which are important in the development of 
delinquency within the cases selected. There- 
fore, within the framework of criteria de- 
scribed in this paper, where delinquency 
seems to result from a combination of pres- 
ent personality status (characterized by poor 
judgment, excitability, and a certain unrealis- 
tic approach to the events of life) plus an ab- 
normal home in which tragedy and disease 
have left their permanent marks, suggestions 
for treatment are not as hard to make as if 
one were dealing with the random delinquent. 
Furthermore, one would suspect that cases se- 
lected in the same way might require similar 
treatment programs since such selection is a 
quasidiagnostic procedure. 


SUMMARY 


A technique for the discovery and identifi- 
cation of potentially delinquent boys was 
illustrated in this paper. A sample of 13-year- 
old boys, drawn randomly from a general 
urban population, was evaluated using the 
MMPI and a survey of their family histories 


obtained from social agencies. A multiple cri- 
teria approach to the identification of the pre- 
delinquent case was developed, starting with 
cases from the general population. To this the 
factors of MMPI codes and instances of se- 
vere disease or of death among members of 
the family were added. Using these two cri- 
teria, it was possible to develop small subpopulations
which were about 80% saturated
with predelinquent boys. Such subsamples
when compared with the general population
were approximately twice as dense with
predelinquent cases since the general population
had a rate of about 40%.

The possibility of using this technique in
the establishment of treatment programs
where small samples could be handled in
areas where large numbers of delinquents are
found was discussed. It was pointed out that
such subpopulations would not be random
samples of delinquents, but that such
subpopulations would in fact be more homogeneous
subsamples of delinquents than are
usually obtained. 
4. If Hypothesis 3 is supported, then those
Ss who draw the opposite sex first will have
an ideal self-concept resembling that of a
parent of the same sex more than Ss who draw
the same sex first.


METHOD 


One hundred and fourteen college undergraduate
students (57 females, 57 males) were given the
Draw-A-Person Test and the Leary Interpersonal
Check List (1956). The mean age for the females
was 20.54 years and for the males almost [illegible]
years. This difference in age is significant at the
[illegible] level. Although our male and female
samples differ significantly in age, it is felt that
this is not a serious drawback, particularly in view
of [illegible] Newton's (1955) study which indicated
that [illegible] differences were not a significant
factor in [illegible] differentiation beyond the
eighth grade. The [illegible] of college education
for the females was [illegible], for the males 2.58,
while for the two [illegible] was 2.75. The
difference in years of education is not
significant (t = 1.46).
On the Leary each S was [illegible] agree or
disagree with 128 [illegible] which
he would use in describing himself, his mother, his
father, and his ideal self. When [illegible] the Draw-
A-Person Test, careful note was made of the sex of
the first drawn figure. This was to be the basis on
which the groups would be divided. [illegible] of the
57 females drew a male figure first, while [illegible]
drew the female figure first. The difference is not
significant (chi square = .40).
The template scoring [illegible] each
Leary protocol. [illegible]
vector was computed. The means and sigmas of
the vectors were computed for the various groups.
Table 1 gives these data. These groups are: fe-
males drawing their own sex first (Ff), those draw-
ing the opposite sex first (Fm), males [illegible]
TABLE 1
MEANS AND SIGMAS OF VECTORS FOR FOUR RATINGS OF ALL GROUPS
Mother Father Ideal Self
Dominance Love Dominance Love Dominance Love
Group M SD M SD M SD M SD M SD M SD
Ff 61.25 9.12 3.73
Fm 61.16 9.79 6.97
F total 61.29 9.30 4.76
Mm 61.22 4 8.92 6.06
Mf 59.37 47.06 12.85 5.96
M total 60.70 49.16 10.26 6.02
Total M&F 56.14 9.21 50.68 7.32 60.99 7.15 55.55 8.81 63.35 7.96 48.99 9.80 65.23 5.09


ratings on Dominance and Love vectors for all
groups.


RESULTS 


Few differences between group means were 
significant. Out of 24 ts, only two reached 
the 5% level: men vs. women, on self-con- 
cepts scored for Love, and women drawing 
same sex first vs. women drawing opposite 
sex first, on mother concept scored for Love. 
These last findings could well be chance fluc-
tuations.
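
The chance interpretation can be given a rough numerical form. The sketch below computes the number of tests expected to reach the 5% level by chance, and the probability that two or more do so, under the simplifying assumption (not made in the text) that the 24 t tests are independent. The function name is hypothetical.

```python
from math import comb

def prob_at_least(k, n, alpha):
    """P(X >= k) for X ~ Binomial(n, alpha)."""
    below_k = sum(comb(n, i) * alpha**i * (1 - alpha)**(n - i) for i in range(k))
    return 1.0 - below_k

n_tests, alpha = 24, 0.05           # 24 t tests, 5% level (from the text)
expected_by_chance = n_tests * alpha
p_two_or_more = prob_at_least(2, n_tests, alpha)

print(f"expected significant by chance: {expected_by_chance:.1f}")
print(f"P(at least 2 of {n_tests} significant): {p_two_or_more:.3f}")
```

Because the comparisons share subjects and ratings, the tests are not in fact independent, so the figure should be read only as a guide.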

The groups as a whole (considered on the 
basis of their sex differences only) showed a 
difference between the means of the Love 
vectors (p= .05) on the self-rating scale. 
Examination of Table 1 shows us that the fe- 
males rate themselves higher on those dimen- 
sions tapped by the Love vector, while males 
scored higher on the Dominance vector. The 
implications of this will be dealt with in the 
discussion section. 

In Table 2 we can examine the significance
tests between the means of the various rat-
ings for the Dominance and Love vectors for
the Ff, Fm, Mm, and Mf groups. The Ff row
shows us that the only difference which is not
significant for the Dominance vector is that
between Father and Ideal Self. Two differ-
ences fail to reach significance, however, for
the Love vector: Self-Father and Mother-
Ideal Self. The Fm group shows a different
pattern from the Ff in that the Self-Mother
ratings on the Dominance vector are not sig-
nificantly different. On the other ratings in
this vector these two groups are essentially
similar. For the Love vector the Fm group is
quite similar on most of the ratings with the
exception of the Self-Ideal Self ratings. The
Fm sample does not differ on these two rat-
ings, while for the Ff sample the difference
between Self and Ideal Self is highly signifi-
cant (p = .001).

The males who drew the male figures first
(Mm) have significant differences between all
the ratings on the Dominance vector except
Father-Ideal Self, a finding common to Ff
and Fm groups as well. Two comparisons of
ratings have small differences on the Love


TABLE 2

t's BASED ON MEAN DIFFERENCES BETWEEN INDICATED PAIRS OF
RATINGS FOR VARIOUS GROUPS

Dominance Vector / Love Vector
Self- Self- Self- Mother- Father- Self- Self- Self- Mother- Father-
Group Mother Father Ideal Self Ideal Self Ideal Self Mother Father Ideal Self Ideal Self Ideal Self
Ff —6.95** —4.18** —3.34* +2.55 5.7 : X 
Fm 3.62 5.69% —Si54e 15°93 S.79%% zias 
F total —6.19%% —4153%* Z384 +453 
Mm z i a73% —6.83%* pores 
Mf - 25 eee +0.83 
M total —3.14* 3.88% 6.19% Fe 
M&F total —4.85** —4.347* 4.87% +0:06 


* Significant at .05 level. 
** Significant at .01 level. 


vector for the Mm group. The Self-Father
ratings show no significant differences, as do
the Mother-Ideal Self ratings. The overall
pattern of significant differences is largely
similar for both groups drawing their own
sex first in both the Dominance and Love
vectors.

Those males drawing the female first (Mf)
had no significant difference between ratings
of Self and Mother on the Dominance vector,
much as the Fm group also showed. And once
again, as each group thus far has shown, there
is no significant difference between Father
and Ideal Self.
Four ratings show no significant differences
in the Love vector for the Mf group. The
Self-Mother, Self-Father, Self-Ideal Self, and
Mother-Ideal Self ratings are without the sig-
nificant differences which characterize the Mm
males. The most outstanding differences be-
tween the two male groups on the Love vector
are on the ratings between Self and Mother
and between Self and Ideal Self, with the Mm
males showing the greatest differences.

Again the overall pattern of significant dif-
ferences is practically identical for both
groups drawing the opposite sex first, a find-
ing which applied to both vectors. It would
appear that the sex of the S doing the draw-
ing does not differentiate the interrating com-
parisons as much as the sex order in which
the figures are drawn.

An inspection of the columns of com-
parisons between parent ratings and Ideal
Self ratings reveals a consistent trend. In the
Dominance vector it is noted that [illegible]
ideally tend to identify (i.e., show no signifi-
cant differences between Father and Ideal Self)
[illegible] except in one instance where [illegible]
show some degree of identification between
Father and Ideal Self. A [illegible] of
significance tests was obtained in the Love
vector except that in this case the groups
unanimously wanted to identify with [illegible]
qualities perceived in their mothers,
not in their fathers.


DISCUSSION


Our study tends to support the statements
made by Machover (1949). On the other
hand, our results are not in line with those
studies suggesting that certain conclusions 
cannot be drawn from a sample which draws 
its own sex first as against a sample which 
draws the opposite sex first. We find marked 
differences between these two groups. These 
differences occur in their perception of their 
self and ideal self as compared to their rat- 
ings of their parents. These are not gross dif- 
ferences but are, rather, of a selective nature. 
Some appear only in relation to qualities of 
dominance, others to love. 

We are in a position now to examine our 
original hypotheses. Let us repeat each and 
see if it is upheld or denied. 

Hypothesis 1. Ss drawing the opposite sex 
first have different self-concepts (as revealed 
by the Leary Interpersonal Check List) from 
Ss drawing the same sex first. 

This was not supported. The self-concept is
measured by the Dominance and Love vectors,
and the means on these vectors were not dis-
similar enough to support this hypothesis, al-
though when the sexes were combined, the
mean female Love score and the mean male
Love score differed at the .05 level by t
test. This could be indicative of a sex
difference, although it may be attributable to
the larger N.

Hypothesis 2. Ss drawing the opposite sex 
first have greater similarity of self-concept 
with their conception of the parent of the op- 
posite sex than Ss drawing the same sex first. 

The Fm group rates the self more like the 
father on dominance qualities than does the 
Mf group Ss who also rate their self more 
like their mothers on dominance qualities than 
does the Mm group. The hypothesis is, there- 
fore, upheld at least for Dominance vectors. 

On the Love vector the hypothesis is only 
partially upheld. Ff and Fm groups are not 
different in how they perceive their self from 
their fathers. But there is a significant differ- 
ence between how the men rate themselves 
and their mothers. The Mf group sees its self 
having love qualities like their mothers’ love 
qualities, but the Mm group is significantly 
different in this respect. 

Hypothesis 3. Ss drawing the opposite sex 
first will show a greater discrepancy of the 
self-concept with their ideal self-concept than 
Ss drawing the same sex first. 
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This is not confirmed. Although on the 
Dominance vector males and females both 
show a significant difference between their self 
and ideal self-concepts, this difference is by 
far more pronounced for Ff and Mm groups. 
On the Love vector the disagreement between 
self and ideal is in fact seen only for Ff and 
Mm groups. This could suggest that Ff and 
Mm groups have higher aspirations and are 
more ambitious for themselves, thereby not 
being easily satisfied. Or it could mean that 
drawing the opposite sex first reflects a
healthier acceptance of one’s goals by either 
being more capable of attaining the ideal, or 
bringing the ideal down more in keeping with 
the way one finds himself. This would cer- 
tainly be a worthwhile subject for further re- 
search, especially in view of the work of 
Butler and Haigh (1954) in which thera- 
peutic progress was found to be a decrease in 
discrepancy between the self and ideal self- 
concepts. 

Hypothesis 4. This hypothesis was contin- 
gent upon the confirmation of Hypothesis 3 
and is hence rejected also. 

The hypotheses originally proposed have 
failed to deal with all of the resulting data. 
A few additional comments are in order. 

The most outstanding finding from this
study reveals that Ss of either sex, if they
draw their own sex first, will tend to be
basically similar regarding perceptions of self,
ideal self, and parents. This also tends to be
true for all Ss drawing the opposite sex first,
in that they too agree with each other on
these ratings.

Although most of the mean difference com-
parisons yielded significant t's, none resulted
for the Father-Ideal Self comparisons on the
Dominance vector for all four groups, sug- 
gesting that all Ss ideally want to be like the 
dominant qualities they perceive in their fa- 
thers. On the Love vector the Self-Father and 
the Mother-Ideal Self comparisons failed to 
reach significance in any of the groups, im- 
plying that all Ss attribute similar love quali- 
ties to themselves and their fathers, although 
ideally they would like to have the love quali- 
ties perceived in their mothers. 

We conclude that Machover’s contention
has at least some inferential support from
this study.


SUMMARY 


One hundred and fourteen college under-
graduate subjects, 57 males and 57 females,
with an average college education of 2.5 years
were given the Draw-A-Person Test and the
Leary Interpersonal Check List. The sex of
the first drawn figure was noted and the rat-
ings of the subjects for themselves, mothers,
fathers, and ideal selves were scored and com-
pared according to Leary’s Dominance and
Love vectors.

An analysis of the data suggests that dif-
ferences exist between the groups drawing
their own sex first and groups drawing the
opposite sex first. Four hypotheses were tested
and research suggestions made from these
data. Of these four, three were rejected and
only one was partially upheld. However, the
pattern of differences between self-concept
and concepts of ideal, of father, and of mother
in the various groups was such as to provide
some inferential support for Machover’s po-
sition.
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REPLICATED FACTORS ON THE MMPI WITH 
FEMALE NP PATIENTS 


WILLIAM J. EICHMAN 


Veterans Administration Hospital, Roanoke, Virginia 


Although factor analytic techniques have
been applied to Minnesota Multiphasic Per-
sonality Inventory (MMPI) scales on a num-
ber of occasions, there is no study in the lit-
erature which employs female NP patients as
a subject group. Consequently, it is unknown
whether the factorial structure of the MMPI
with female patients is different from that
with male patients. Studies with male sub-
jects indicate essential agreement as to the
loadings on the first two factors although the
interpretations of the factors differ from one
study to the next. Welsh (1956) seems to
have conceptualized the two dimensions most
adequately as anxiety and repression. He de-
veloped item scales for the two factors and
labeled them A and R. More than two fac-
tors have been found in all reported studies
but the loadings have differed from one study
to the next. Welsh (1956) identified two fur-
ther factors in his study and developed item
scales for them. These scales were difficult
to interpret and have received little fur-
ther attention. Wheeler, Little, and Lehner
(1951) found “paranoid adjustment” and
“psychopathic adjustment” factors. Kasse-
baum, Couch, and Slater (1959) found a
third factor which they labeled “tender-
minded sensitivity.”

With the single exception of Welsh, no one
has attempted to make practical use of the
factor studies. Authors have been almost uni-
versally critical of the MMPI as a measuring
instrument because it seemed to measure
only two variables and took 12 or more scales
to do the job. To a very large extent this
criticism is a justifiable one although it should
be noted that a number of the scales have
only moderate communalities in the tables of
factor loadings. An example of this is seen
in the recent study of Kassebaum et al.
(1959) in which 6 of the 12 scales have com- 
munalities below .50 after the extraction of 
three factors. These are the L, F, Hs, Hy, Mf, 
and Pa scales. Thus many of the scales have 
extremely large error variances or measure 
something which is unique to the particular 
scale. The positive results of a large number 
of predictive studies using these scales would 
support the latter interpretation. The end re- 
sult of the factor studies on the one hand and 
the predictive studies on the other leaves the 
practising clinician in a state of confusion. 

The present study attempts to identify fur- 
ther factors in samples of female NP patients 
at a VA hospital. One of the flaws in previous 
studies is that few have been replicated using 
the same matrix of scales. The authors have 
hesitated to accept more than the first two 
factors as significant. Some have not used the 
validity scales (L, F, K), and others have 
employed a variety of additional scales be- 
yond the 9 original scales with little congru- 
ence of scales from one study to another. 
This study utilizes 17 scales in two samples 
of NP females. 


METHOD 


Subjects. Two samples (Ns = 62, 85) of female 
patients were used. Aside from the fact that the first 
group of records was collected earlier than the sec- 
ond, the two groups did not appear to differ. The 
mean age of the total group was 35.4 with an SD 
of 8.3. Length of hospitalization ranged from a few 
days to 10 years. The diagnostic groups represented 
are presented in Table 1. Mean scores on the MMPI 
scales for the combined sample are presented in 
Table 2. Tests of significance between the groups on 
each of the 17 scales showed a difference only on the 
Mf scale and this was small in absolute magnitude. 

Scales. Seventeen MMPI scales were used in each 
sample. These were the 3 validating scales, the 9 
clinical scales, the Taylor A scale (Ar) (Taylor, 
1953), the A and R scales (Aw and Rw) (Welsh, 
TABLE 1

PRIMARY DIAGNOSES OF TWO SAMPLES
OF NP FEMALES

                                      Sample A   Sample B
                                      (N = 62)   (N = 85)
Schizophrenia                            32         45
Manic-Depressive                          2          6
Psychotic Depression                      1          1
Schizophrenic, Post-lobotomy              3          0
Character and Personality Disorders       3          7
Neuroses                                 17         21
Organics                                  4          3
Unclassified                              0          2
Total                                    62         85


1956), the Dependency scale (Dp) (Navran, 1954),
and the Ego Strength scale (Es) (Barron, 1953).

Statistical technique. Raw scores were used for
obtaining product-moment correlations. The factor
analysis was carried out by Thurstone’s centroid
method. All rotations were done by simple graphical
procedures.


RESULTS


Correlation matrices for the separate sam-
ples are not presented since the only purpose
in using two analyses was to assess the sta-
bility of factor loadings.

Four factors were extracted from the cor-
relation matrix of each sample. Further fac-
tors were not extracted since the average
residual value in the fourth residual matrix
was .02. The largest residual value found at
this time was .08. The centroid loadings are
presented in Table 3. Orthogonal rotations
were then carried out by simple graphical pro-
cedures. The rotated loadings are presented
in Table 4. Plotting the test loadings on the
factor vectors quickly indicated that a very
close approximation to simple structure could
be obtained by maximizing the loadings on
Aw and Rw for the first and second factors,
TABLE 2
MEAN MMPI T SCORES FOR TOTAL SAMPLE
(N = 147)

L F K Hs D Hy Pd Mf Pa Pt Sc Ma

S430 SAT 55405726. 63.3 66.0 49.8 63.4 55.6 58.6 58.6

TABLE 3

CENTROID FACTOR LOADINGS FOR SAMPLES A AND B

Factor I Factor II Factor III Factor IV h²
A B A B A B A B A B
L -48 -46 +50 +27 17 -32 32 fh
0 -= -12 +1 52 y
F +66 +63 +45 -40 -40 -42 +13 T 82 H
K -70 -66 +41 +53 +14 -14 -18 +25 my
Hs +80 +67 +34 +42 -13 -35 -34 -30 89 &s
D +69 +74 +45 +40 +36 +23 +21 +30 85
Hy 60 +33 +52 +73 +05 -23 -39 -03 78 f
Pd +65 +72 +4 1] -17 -07 +20 +32 51 I
Mf +26 +17 -17 +24 +16 +15 -32 -13 22 uy
Pa +70 +76 +16 -12 -17 -25 -08 +06 55 6
Pt +91 +94 +09 -02 +27 +24 +20 +02 95 9
Sc +89 +91 +18 -26 -17 -18 25; Aio 92 %,
Ma +42 +36 -31 -63 -42 -28 -2 j 40 &
Ar +95 +93 +02 +10 +22 +26 -03 -07 95 9
Aw +92 +92 -14 -20 +26 +23 +17 +03 96 9;
Rw -06 +14 +66 +62 +37 +10 +11 +34 59 7
Dp +88 +91 -22 -10 +32 +26 +18 -04 96 A
Es -86 -80 -10 -14 12 +18 +10 +15 a7 ©
TABLE 4
ROTATED FACTOR LOADINGS FOR SAMPLES A AND B

Factor I Factor II Factor III Factor IV h²
A B A B A B A B A B
i 59 —58 +34 +20 +19 +07 +17 +24 53 44 
F 447 +58 +20 —23 +34 +16 +66 +59 81 76 
K =72 16 +40 +47 +05 —03 =17 +07 71 80 
As +461 +45 +16 +27 +66 +74 +21 00 88 82 
D +71 +68 +57 +62 +06 +02 +11 +01 S4 85 
ly 443: +11 +38 +63 +65 +53 +07 —01 76 69 
Pa 458 +69 +05 +13 +12 —01 +40 +35 51 62 
Mj 195 Tia —13 +19 +25 +12 —29 +25 23 13 
Pa 457 +68 +03 —01 +38 +26 +28 +33 55 64 
Pi A Fos 422 +16 +03 +08 +09 —05 94 94 
Se ee +88 +09 —08 +15 +18 +48 +35 92 94 
a “a i. veg 6h +32 +14 +21 +26 50 65 
ir Log 92 +10 +22 +25 +16 —01 —14 96 94 
‘Ae a ae 402 00 o0 00 o0 00 96 94 
R +98 +97 477 +73 —01 00 —02 00 60 53 
D —=05 00 Tos +04 —06 —09 —07 +05 95 90 
b +97 +94 L13 —32 —47 —05 —09 7 7 

is -81 —68 ail is 


respectively. This procedure has the addi-
tional advantage of objectifying the process
of rotation for the two samples and making
the comparison of the two analyses more
meaningful. Thus only the locations of
reference axes for Factors III and IV had to
be determined by subjective means. This last
rotation for each sample was done with
Thurstone’s principle of simple structure in
mind. Discussion and comparison of the four
factors are presented below.
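
The extraction step described above can be illustrated with a minimal sketch of one pass of the centroid method: loadings are taken as column sums of the correlation matrix divided by the square root of the grand total, and the residual matrix is inspected to decide whether to stop. This is not the author's computation; the 3 x 3 matrix is hypothetical, communality estimates are assumed to sit on the diagonal, and the sign-reflection step used in practice is omitted.

```python
import math

def first_centroid_factor(r):
    """Return (loadings, residual) for one centroid factor.

    r is a symmetric correlation matrix (list of lists) with communality
    estimates on the diagonal. Assumes no sign reflection is needed.
    """
    n = len(r)
    col_sums = [sum(r[i][j] for i in range(n)) for j in range(n)]
    total = sum(col_sums)
    loadings = [s / math.sqrt(total) for s in col_sums]
    residual = [[r[i][j] - loadings[i] * loadings[j] for j in range(n)]
                for i in range(n)]
    return loadings, residual

def mean_abs_offdiag(m):
    """Average absolute off-diagonal value, used here as a stopping check."""
    n = len(m)
    vals = [abs(m[i][j]) for i in range(n) for j in range(n) if i != j]
    return sum(vals) / len(vals)

# Hypothetical one-factor matrix built from loadings .9, .8, .7
a = [0.9, 0.8, 0.7]
r = [[a[i] * a[j] for j in range(3)] for i in range(3)]

loadings, residual = first_centroid_factor(r)
print([round(x, 2) for x in loadings])
print(round(mean_abs_offdiag(residual), 4))
```

With real data the procedure would be repeated on each residual matrix until the average residual is negligible, which is the stopping rule the text reports (.02 after the fourth factor).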


Comparison of Factor Loadings

Factor I. The first factor in both
samples is obviously the same as in
previous factor analyses of the MMPI.
Loadings above .9 are found on Aw, Ar, Dp,
and Pt in both samples. Correlating the pairs
of loadings results in a Pearson coefficient of
.98 and a rho of .96.

Factor II. Significant loadings (above .3)
are found for six scales of Factor II in Sam-
ple A and for five of these same six in Sample
B. In order of magnitude, with the respective
loadings in parentheses, the scales are Rw
(.77, .73), D (.57, .62), Ma ([illegible], -.64),
K (.40, .47), Hy (.38, .63), and L (.34, .20).
The Pearson r for the 17 pairs of loadings is
[illegible] and rho is .74. Thus considerable corre-
spondence is found for this factor in the two
samples. The scales which have the highest
loadings also indicate that the factor is highly
similar to the second or R factor found in
previous experiments.

Factor III. The third factor showed six sig-
nificant loadings in Sample A but only three
in Sample B. Those scales which had signifi-
cant loadings in both analyses were Hs (.66,
.74), Hy (.65, .53), and Es (-.32, -.47).
The three scales which did not appear to be
significant in the second sample were Pa (.38,
.26), F (.34, .16), and Ma (.32, .14). That
this difference between the two analyses is
more apparent than real is indicated by a
Pearson r of .95 between the pairs of loadings
and a rho of .89. As a consequence, it was ac-
cepted that the factor was satisfactorily repli-
cated.
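
The two replication indices used here, Pearson r and Spearman rho between the Sample A and Sample B loadings, can be sketched as follows. The example uses only the six pairs of Factor III loadings quoted in the paragraph above, not all 17 scales, so it is not expected to reproduce the reported .95 and .89. The helper names are hypothetical, and the rank routine assumes no tied values.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out

def spearman_rho(x, y):
    """Spearman rho computed as Pearson r on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Factor III loadings quoted in the text: Hs, Hy, Es, Pa, F, Ma
sample_a = [0.66, 0.65, -0.32, 0.38, 0.34, 0.32]
sample_b = [0.74, 0.53, -0.47, 0.26, 0.16, 0.14]

r = pearson_r(sample_a, sample_b)
rho = spearman_rho(sample_a, sample_b)
print(round(r, 2), round(rho, 2))
```

A high r with a lower rho, as reported for the full set of 17 scales, would indicate that the large loadings agree closely while the ordering of the small loadings is less stable.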

Factor IV. Three scales attain significance
on the fourth factor in both samples. These
are F (.66, .59), Sc (.48, .35), and Pd (.40,
.35). An additional scale, Pa (.28, .33), passes
the significance point in Sample B. Although
this factor can be accepted as similar in both
samples, the correspondence is not nearly so
great as in the other factors. The Pearson r
is .70 and the rho is .58 between the sets of
loadings. Fairly large differences in absolute
loadings occurred on two scales, Mf (-.29,
.24) and K (-.17, .07); and large relative
differences occurred on others, e.g., Hs from
fifth highest to twelfth highest, Pt from ninth
to fifth highest, and Dp from fifteenth to
ninth highest.


Combined Results 


Since close identity was found for the first 
three factors in both samples and similarity 
was found for the fourth, the two samples 
were combined and the results were reana- 
lyzed as described previously. The correlation 
matrix for the combined sample, the centroid 
loadings for the data, and the rotated load-
ings are presented elsewhere.¹


Interpretation of Factors 


Factor I. This factor accounts for 63.1% 
of the common factor variance in the table 
of intercorrelation. The high loadings on Aw,
Ar, Dp, and Pt indicate that it is identical 
with factors found in previous studies. It ap- 
pears to be a general maladjustment, anxiety, 
and/or complaint factor. 

Factor II. The second factor accounts for
16.0% of the common factor variance. Since
it has its greatest loading on Rw, we may as-
sume that it is highly similar, if not identical,
to Welsh’s repressive-expressive factor. The
other scales with high loadings are D (.62),
Hy (.56), Ma (-.55), and K (.41). These
loadings are consistent with the interpreta-
tion of a bipolar factor with repression at one
extreme and expression at the other.

Factor III. This factor accounts for
9.6% of the common factor variance [illegible]
somatization variable. There may be other
correlates of the factor such as are repre-
sented by the subtle items of the Hy scale
(Wiener, 1948), those whose content deals
with character formation rather than with
physical symptomatology, but this cannot be
established by the present study.


1 Tables of the correlation matrix, centroid load-
ings, and rotated loadings for the combined sam-
ple have been deposited with the American Docu-
mentation Institute. Order Document No. 6465 from
ADI Auxiliary Publications Project, Photoduplica-
tion Service, Library of Congress; Washington 25,
D. C., remitting in advance $1.25 for 35 mm. micro-
film or $1.25 for 6 X 8 in. photocopies. Make checks
payable to: Chief, Photoduplication Service, Library
of Congress.


Factor IV. Factor IV accounts for 11.3%
of the common factor variance in the study
but is apparently stable from one sample to
another. High loadings are found on five
scales: F (.70), Sc (.48), Pa (.44), Ma
(.37), and Pd (.34). These scales when
found to be high on a clinical profile are usu-
ally interpreted as representing a psychotic
level of functioning and/or a potential for
the acting-out of impulses. The writer pre-
fers to label the factor as an acting-out tend-
ency at the present time since clinical experi-
ence indicates that many patients with char-
acter disorders also have high scores on these
scales.
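
The "percent of common factor variance" quoted for each factor is the sum of squared loadings on that factor divided by the sum of all communalities. A minimal sketch follows; the loading matrix is hypothetical and chosen only to make the arithmetic easy to follow, and the function name is an assumption.

```python
def percent_common_variance(loadings):
    """Share of common variance per factor.

    loadings: list of rows, one row per scale, one column per factor.
    Returns a list of percentages, one per factor, summing to 100.
    """
    n_factors = len(loadings[0])
    ss = [sum(row[k] ** 2 for row in loadings) for k in range(n_factors)]
    total = sum(ss)  # equals the sum of the communalities
    return [100.0 * s / total for s in ss]

# Hypothetical orthogonal loadings for three scales on two factors
example = [
    [0.6, 0.0],
    [0.8, 0.0],
    [0.0, 0.5],
]
shares = percent_common_variance(example)
print([round(s, 1) for s in shares])
```

Because the shares are relative to the common variance only, a factor can account for a small percentage and still replicate well across samples, which is the point made for Factor IV.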


Comparison with Other Studies 


Two studies were selected for comparison
with the present study on the basis that simi-
lar scales were included in the matrix. Both [illegible]


and female subjects. The 4" 
and Rw factors i 


show such strik 

Somatization 
third factor “te 
tially because 
cluded in the 
used in both 


e 
e similarity exists between the 
r Study it was .83. Thus some correspondence
among factors on all three studies was found.
This correspondence was at a maximum when
the subjects were psychiatric patients and at
a minimum when the subjects were college
students. Closer correspondence could prob-
ably be obtained by rotating the factors of
other studies so that Aw and Rw would be
at a maximum as was done in this study.

Acting-out factor. The fourth factor found
in this study resembles a fifth factor found
by Fisher and described as a social alienation
factor with psychotic implications. The corre-
spondence between the acting-out factor de-
scribed above and Fisher’s social alienation
factor in the psychiatric sample is particu-
larly close; in both cases the same five scales
have loadings above .3. These are F, Pd, Sc,
Pa, and Ma. The Pearson r for the 16 pairs
of loadings is .70. As indicated previously,
rotations maximizing Aw and Rw would likely
result in an increased correlation.


DISCUSSION


The most important finding of this study
appears to be the stability of factor loadings
on replication with a similar sample of sub-
jects. The somatization and acting-out fac-
tors found here each account for approxi-
mately 10% of the common factor variance,
an amount which several investigators have
deemed not worthy of interpretation. Never-
theless, the factors, as presently interpreted,
seem to represent frequently observed behav-
iors in patients and appear to be worthy of
objective measurement. The finding that the
factors account for so little of the common
factor variance is the result of the original
construction of the test. For example, if
scales were constructed separately to measure
the different psychophysiological reactions, we
might expect to find a common factor of
somatization which would account for a
large amount of the common variance of the
[illegible]. If, in the same table of intercorrela-
tions, only [illegible] measures of [illegible]
adjustment [illegible], we would find that
[illegible] factor accounted for [illegible]
common variance.

[illegible] that the MMPI, as pres-
ently scored with the clinical and validity
scales, is overloaded with measures of general
maladjustment and that other more pure
scales can be profitably constructed. The
second factor found in this study can be 
measured fairly well by the Rw scale. The 
third factor (somatization) can be fairly well 
measured by the altitudes of the Hs and Hy 
scales on the MMPI profile. The loadings of 
these scales on Factor III are greater than 
their corresponding loadings on the general 
maladjustment factor. The five scales which 
have respectable loadings on Factor IV (act- 
ing-out) have high loadings on the general 
maladjustment factor as well. Consequently 
it seems that this factor cannot be so easily 
distinguished from general maladjustment 
when the clinician is faced with an individual 
profile. Thus if we hope to measure acting- 
out potential with the MMPI, it seems very 
desirable to construct a scale which is uncon- 
taminated with the general maladjustment 
factor. 

The factorial composition of the Ar scale, 
the Dp scale, and the Es scale shows that 
these measures overlap to a very considerable 
extent with the Aw and Pt scales. A substan-
tial body of literature has grown up around
several of these scales as if something unique 
were being measured. It would seem wise for 
the person who develops a new scale to do 
correlational studies with already existing and 
validated scales. 

We can expect similar developments in the 
future, i.e., many new scales will be created 
and many of these will be near duplicates of 
those already in existence. An example of this 
is to be found in the recent paper by Kasse- 
baum et al. (1959) in which 19 nonclinical 
scales were included in a factor analysis with 
the original MMPI scales. Excluding Aw and 
Rw, the average factor loading of the remain- 
ing 17 scales on the general maladjustment 
factor is .66. The same average on the second 
or repression factor is .31. Squaring each of 
these average factor loadings and summing to 
arrive at a communality we arrive at a figure 
of .53. Thus approximately 50% of the total 
variance of these new scales is wasted on fac- 
tors which are measured much better by other 
scales. The supposed “nonclinical” scale is 
often as much affected by this contamination 
of general maladjustment and repression as 
is the clinical scale. In fact the nonclinical 
scale designed to arrive at the strengths of 
individual subjects is often a clinical scale 
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turned upside down. Examples drawn from 
the above mentioned paper include the Lead- 
ership scale with a loading of —.85 and the 
Tolerance scale with a loading of —.80 on 
the first factor. In the analysis presented 
here, the only extreme example is the Dp 
scale although the Es scale has a high load- 
ing on Factor I with a dash of Factor III 
(somatization) thrown in. 
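
The communality figure of .53 cited above follows directly from the two average loadings given in the text (.66 on general maladjustment, .31 on repression), since with orthogonal factors a communality is the sum of squared loadings. A short check, with a hypothetical function name:

```python
def communality(loadings):
    """Sum of squared loadings on orthogonal factors."""
    return sum(a ** 2 for a in loadings)

# Average loadings of the 17 remaining scales, as reported in the text
avg_loadings = [0.66, 0.31]
h2 = communality(avg_loadings)
print(round(h2, 2))
```

Squaring the average loading is not the same as averaging the squared loadings, so the result is best read as the approximation the text describes rather than an exact mean communality.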

The above discussion can be summed up as 
an argument against the endless accumulation 
of scales on personality inventories, scales 
which purport to measure one thing but 
which actually measure something else much 
better. The wiser course of action would seem 
to be checking out each item of a new scale 
against certain basic scales, particularly the 
general maladjustment factor in any of its 
several forms. Each new item should at least 
correlate more highly with the criterion 
against which the scale is being developed 
than it does with the general maladjustment 
variable. Without this precautionary measure, 
there will be a piling up of variance associ- 
ated with general maladjustment to the ex- 
tent that the end result is a good measure of 
the wrong thing. 

An argument can be made that the general 
maladjustment fac 
and this appears t 
has found at least 
an item analysis of 
previously indicated, i 


ment need to be 
that the gener. 


merely a global construct which would be 
more useful 


practical 
ew items 
subscales 
in order 
redictive 


SUMMARY 


Seventeen scales from the MMPIs of fe- 
male NP subjects were factor analyzed and 
replicated. Four factors emerged clearly in 
both samples. These were labeled tentatively 


as anxiety, repression, somatization, and act-
ing-out. Comparisons were made with two fac-
torial studies of male subjects and consider-
able correspondence was found for all four
factors. Similarity of factor structure was
greatest when NP male patients were com-
pared with NP female patients and least when
male college students were compared with
male NP patients.

The results of the study seem to indicate a
clear need for the construction of pure scales
to measure the third (somatization) and fourth
(acting-out) factors. Existing scales which re-
late to these dimensions of behavior are highly
correlated with first and second factor scores
and cannot easily be interpreted.

A further implication of the results is that
careless construction of new empirical scales
has resulted in near duplicates of the [illegible]
scales which makes them relatively useless.
Additional disadvantages are that these new
scales are variously named according to the
particular criterion employed and that inde-
pendent bodies of research tend to build up
around them.
Yates (1954, p. 374) has voiced the criticism that many investigators
seem to view brain damage as a “unitary factor.” Klebanoff, Singer, and
Wilensky (1954) have suggested that the lack of agreement in the
results of studies of psychological impairment related to organic brain
damage or disease may in large part reflect differences in type, locus,
or severity of the brain lesions represented in the samples studied.
Birch and Diller (1959) point out that “a clear view of the evidence is
made difficult, or even impossible, by the fact that the various
parameters of cerebral dysfunction have not been examined
systematically” (p. 188). Macroscopic and microscopic studies as
reported in neurology texts such as Wechsler (1958) indicate that the
type or severity of brain lesions may depend on differences in the
organic condition of the brain. The detrimental effects upon adaptive
abilities due to acutely destructive lesions such as intrinsic tumors
or cerebral vascular accidents may be more dramatic than the effects of
relatively static conditions such as healed head wounds or slowly
progressive conditions. The present study was designed to investigate
psychological deficits in relation to acuteness of organic brain
lesions.


1 This investigation was supported in part by Research Grant B-1468
from the National Institute of Neurological Diseases and Blindness,
United States Public Health Service. The writers are indebted to
Mary [...] for assistance with the statistical computations.


METHOD

Subjects

Four groups, each consisting of 16 hospitalized patients, were
studied. Corresponding subjects were
individually matched as closely as possible according 
to chronological age, sex, race, and years of educa- 
tion. Three groups were composed of patients diag- 
nosed as having organic brain damage or disease. 
Diagnoses were based upon detailed medical his- 
tory, electroencephalography, neurological examina- 
tion, and, when further clarification was needed, 
angiography, pneumography, and repeated neuro- 
logical examinations. The fourth, or Control group, 
was composed of patients in whom organic brain 
damage was confidently ruled out on the basis of 
similar, although generally less extensive, clinical 
diagnostic procedures. 

One brain damaged group (Acute) was composed of patients who had acute
neurological illnesses and whose neurological signs and symptoms were
present at the time of psychological testing. These patients had
experienced a specific, temporally defined episode during which their
current neurological findings had arisen, or had developed a rapidly
progressive brain disease with steady progression of neurological
signs. A second brain damaged group (Relatively Static) was composed of
patients who had either recovered from acute neurological signs if
there had been an acute onset of symptoms, or who had slowly
progressive brain disease without evidence of acute or sudden onset.
Among this group, the patients with sudden onset of brain dysfunction
(e.g., penetrating head injury) had with the passage of time recovered
from acute neurological deficits, suggesting reorganization of brain
functions and a relatively static condition of the brain. The third
brain damaged group (Chronic-Static) was composed of patients with
chronic, long-standing brain dysfunction who were institutionalized in
a state hospital for patients with neurological disorders. The
diagnoses of all patients in this group included some form of epilepsy.
None of the other groups included institutionalized patients. Diagnoses
of the patients in the four groups are presented in Table 1.

Differences between the mean ages and mean number of years of
education among the groups did not approach statistical significance.
The mean ages, in years, were: Acute, 32.62 (SD 10.13); Relatively
Static, 33.88 (SD 10.39); Chronic-Static, 32.88 (SD 10.84); and
Controls, 32.38 (SD 10.82). Mean years of education for the groups, in
the same order, were: 9.69 (SD 2.36); 9.31 (SD 2.08); 9.00 (SD [...]);
and 9.06 (SD 3.03).


K. B. Fitzhugh, L. C. Fitzhugh, and R. M. Reitan


TABLE 1

DIAGNOSTIC DISTRIBUTIONS WITHIN BRAIN DAMAGED AND CONTROL GROUPS

Acute (N = 16)
    Acute subdural hematoma                               1
    Astrocytoma                                           3
    Cerebral vascular accident                            3
    Glioblastoma multiforme                               3
    Metastatic carcinoma, cortical                        2
    Postoperative arteriovascular malformation            1
    Preoperative meningosarcoma                           1
    Recent penetrating head injury                        2

Relatively Static (N = 16)
    Cerebral arteriosclerosis                             1
    Chronic subdural hematoma                             1
    Closed head injury                                    1
    Cortical atrophy                                      2
    Healed cortical abscess                               1
    Healed penetrating head wound                         1
    Multiple sclerosis                                    5
    Posttraumatic concussion                              1
    Psychomotor epilepsy                                  3

Chronic-Static (N = 16)
    Convulsive disorder due to infectious disease         3
    Convulsive disorder (grand mal) due to unknown cause  7
    Posttraumatic convulsive disorder                     4
    Psychomotor epilepsy                                  2

Controls (N = 16)
    Cancer of nasopharynx                                 1
    Character disorder                                    2
    Facial laceration                                     1
    Neurological complaints without CNS disease           2
    No clinical disorder found                            1
    Non-CNS surgery                                       2
    Paraplegia                                            2
    Psychoneurosis                                        2
    Recurrent lumbar disc disorder                        1
    Schizophrenic reaction                                1
    Superficial occipital osteoma                         1


Procedure

All patients were administered the Wechsler-Bellevue Intelligence
Scale [...] measures described by [...] of biological intelligence.
[...] indicators used were those [...] be the most sensitive [...]
subjects with and subjects without evidence of organic brain damage.
Additionally, a composite score (Impairment Index) was computed for
each subject based upon the number of Halstead variables on which the
subject’s performance ranked within the range characteristic of brain
damaged individuals.

In order to facilitate group comparisons and equalize variability on
the several measures, for each variable the raw scores from all groups
were pooled and ranked from poorest to best performance. These ranks
were converted to normalized standard scores (T scores). Since the
groups had been equated by matching individuals, any two groups could
be compared by calculating the mean of the T score differences between
the corresponding individuals of the two groups. This mean difference,
in turn, was evaluated by Student’s t. Also, in order to present the
intelligence quotients in a familiar manner, mean scores and standard
deviations for these variables were computed from the raw scores.


TABLE 2

[...] AND STANDARD DEVIATIONS ACCORDING TO ACUTENESS OF LESIONS

                                Relatively       Chronic-
                 Acute          Static           Static           Control
               M      SD      M      SD       M      SD        M      SD
Full         80.38  13.99   92.81  17.07    90.38  [...]     [...]  [...]
Verbal       80.31  18.45   95.12  15.70    88.88  18.13     [...]  [...]
Performance  84.44  13.88   91.81  18.27    93.75  13.05     [...]  [...]
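
The scoring and comparison procedure described in this section can be sketched in a few lines. This is a minimal illustration under assumptions, not the authors' computation: the function names and the raw scores are hypothetical, tied scores share an average rank, and the conversion from rank to percentile uses (rank - 0.5) / n, which is only one of several common conventions for normalizing ranks.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def normalized_t_scores(raw_scores, higher_is_better=True):
    """Rank pooled raw scores from poorest to best and convert the ranks
    to normalized T scores (mean 50, SD 10). Ties share the average rank."""
    n = len(raw_scores)
    # Index order from poorest to best performance.
    order = sorted(range(n), key=lambda i: raw_scores[i],
                   reverse=not higher_is_better)
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and raw_scores[order[end + 1]] == raw_scores[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # 1-based rank shared by the tie
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    unit_normal = NormalDist()
    # Assumed percentile convention: (rank - 0.5) / n.
    return [50 + 10 * unit_normal.inv_cdf((rank - 0.5) / n) for rank in ranks]


def paired_t(group_a, group_b):
    """Student's t for matched pairs: mean difference over its standard error."""
    diffs = [a - b for a, b in zip(group_a, group_b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical raw scores for four matched pairs (not data from the study).
control_raw = [30, 28, 35, 32]
acute_raw = [20, 25, 22, 27]

pooled_t = normalized_t_scores(control_raw + acute_raw)
control_t, acute_t = pooled_t[:4], pooled_t[4:]
t_value = paired_t(control_t, acute_t)  # evaluated with n - 1 = 3 df
```

For a measure on which a low raw score is the better performance (a time or error score, for example), `higher_is_better=False` reverses the ranking so that a higher T score still means better performance.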


RESULTS 


Means and standard deviations of the Full, Verbal, and Performance
scale IQ scores are presented in Table 2. Only the Acute group’s mean
IQ scores were consistently below the range of 90-109.

The general trends of the four groups on the Wechsler-Bellevue
variables are presented in Figure 1. The Control group performed at
levels consistently superior to those of the brain damaged groups. The
mean IQ score differences between Controls and brain damaged groups
were significant at, or well beyond, the .05 level (see Table 3). Among
the brain damaged groups, mean scores for the Acute lesion group were
generally inferior to those of the two static lesion groups. Mean
Verbal IQ of the Relatively Static group was higher than was that of
the Acute group (p < .05), and mean Performance IQ of the
Chronic-Static group was higher than that of the Acute group. Two of
the three brain damaged groups (Acute and Chronic-Static) obtained
slightly better Performance than Verbal mean scores (see Table 3).

[Figure 1. Panel title: WECHSLER-BELLEVUE SCALE (FORM I); vertical axis: T SCORES.]

Fig. 1. Graphic presentation of mean T score values on
Wechsler-Bellevue variables for control group and three brain damaged
groups.




On only 2 of the 11 Wechsler subtests did 

the mean difference scores between Controls 
and Acutes fail to exceed the .01 level of 
significance; and the mean difference scores 
were significant beyond the .05 level on those 
two subtests (Digit Span and Picture Ar- 
rangement). Comparisons between Controls 
and each of the static lesion groups also 
yielded significant differences on several sub- 
tests (see Table 3). 
The general trends of the four groups on
the Halstead Neuropsychological measures 
may be seen in Figure 2. As with the Wechs- 
ler-Bellevue, the Control group performed 
at levels consistently exceeding those of the 
brain damaged groups. Among the latter 
groups, the two static lesion groups per- 
formed at fairly comparable levels, although 
their mean scores generally exceeded those of 
the Acute group. 

On all but two of the eight Halstead vari- 
ables, mean difference scores were significant 
beyond the .001 level when the Control and 
Acute groups were compared. On the two re- 
maining variables, Speech-Sounds Perception 
and Finger Oscillation, the Control and Acute 
groups were differentiated beyond the .01 and 
.05 levels, respectively (see Table 4).

Controls were differentiated from Chronic-Statics on all Halstead
variables except Speech-Sounds Perception; and Controls were
significantly differentiated from Relatively Statics on five of the
eight Halstead variables. Every brain damaged group was differentiated
from Controls on the composite measure, Impairment Index, [...] level.
Such differentiation is consistent with the findings of Reitan (1959a)
on a heterogeneous group of brain damaged patients.


TABLE 3

t RATIOS BASED UPON DIFFERENCES BETWEEN EQUATED PAIRS ON
WECHSLER-BELLEVUE VARIABLES

                      Control     Control     Control     Acute       Chronic     Chronic
                      vs.         vs.         vs.         vs.         vs.         vs.
                      Acute       Relatively  Chronic     Relatively  Acute       Relatively
Test Variable                     Static                  Static                  Static
Full IQ               5.14****    3.41***     [illegible] 1.87        1.58        .43
Verbal IQ             4.20****    2.32*       [illegible] 2.37*       1.31        1.15
Performance IQ        [illegible] 3.78***     4.47****    1.16        2.21*       .39
Information           [illegible] 1.36        4.34****    1.16        .50         3.04***
Comprehension         3.66***     1.70        2.95***     1.55        .66         1.05
Digit Span            2.20*       1.06        1.39        1.89        1.69        .29
Arithmetic            [illegible] 1.81        1.88        2.91**      2.19        .04
Similarities          4.28****    1.73        2.47*       2.33*       .96         [illegible]
Vocabulary            4.10****    .92         2.53*       2.00        .93         1.85
Picture Arrangement   2.86**      2.47*       2.10        .44         1.03        .33
Picture Completion    4.67****    1.37        1.94        1.57        1.55        .26
Block Design          6.38****    [illegible] [illegible] 1.97        1.54        .29
Object Assembly       4.36****    3.50***     2.14*       .60         2.03        1.16
Digit Symbol          5.16****    3.90***     5.17****    .65         1.22        .38

   * p < .05.
  ** p < .02.
 *** p < .01.
**** p < .001.


[Figure 2. Panel title: HALSTEAD NEUROPSYCHOLOGICAL INDICATORS; vertical axis: T SCORES; curves labeled RELATIVELY STATIC, CHRONIC-STATIC, and ACUTE.]

Fig. 2. Graphic presentation of mean T score values on Halstead
Neuropsychological measures for control group and three brain damaged
groups.

On the 22 test variables studied the two static groups differed
significantly from each other on only one, the Information subtest of
the Wechsler-Bellevue. This particular difference may be considered
suggestive of the effects of institutionalization upon the
Chronic-Static group. In contrast, the Acute group performed
significantly less well than one or both of the static groups on
several variables. Differentiation occurred at levels exceeding the
.05 level on the Wechsler-Bellevue variables of Arithmetic,
Similarities, Verbal IQ, and Performance IQ. The Halstead Indicators of
Memory and Location variables of the Tactual Performance Test, and the
Seashore Rhythm Test differentiated Acutes from one or both of the
static groups at levels exceeding the .05 level of significance.


TABLE 4

t RATIOS BASED UPON DIFFERENCES BETWEEN EQUATED PAIRS ON
HALSTEAD NEUROPSYCHOLOGICAL INDICATORS

                      Control     Control     Control     Acute       Chronic     Chronic
                      vs.         vs.         vs.         vs.         vs.         vs.
                      Acute       Relatively  Chronic     Relatively  Acute       Relatively
Indicator                         Static                  Static                  Static
Category              4.90****    [illegible] 3.90***     .82         .04         .87
TPT Time              6.02****    [illegible] [illegible] .92         .72         .57
TPT Memory            5.20****    1.85        2.45*       2.20*       1.64        .62
TPT Location          5.72****    3.04***     3.96***     2.72**      2.15*       .86
Rhythm                4.60****    1.94        2.59*       3.41***     2.10        1.26
Speech                3.39***     1.47        1.46        1.53        1.58        [illegible]
Finger Oscillation    2.55*       3.27***     [illegible] [illegible] .42         .34
Impairment Index      6.50****    [illegible] 5.17****    1.27        .25         1.83

   * p < .05.
  ** p < .02.
 *** p < .01.
**** p < .001.


DISCUSSION

As Rosvold (1959) pointed out recently:

Studies with respect to the effect of brain damage on general
intelligence, though more rigorous than in the past, are no more in
agreement than were earlier studies, some of which claimed
deterioration, others not (p. 434).

The basis for disagreement may relate in part to differences in the
types of brain lesions studied. Many studies in this area have used
subjects with brain damage [...] head injuries (Aita, Armitage, [...]
Rabinovitz, 1947; Milner, 1956; [...] Teuber & Mishkin, 1954;
Weinstein [...] Teuber, 1957). (These and other references in this
section are illustrative rather than exhaustive.) In some instances the
patients have been in the intermediate recovery period (Aita et al.,
1947); in others ([...] and his co-workers) at least several years
have elapsed since head trauma occurred, and in temporal lobe epilepsy
(Milner, 1956) brain damage probably occurred in most instances at
birth or in childhood. In terms of [...] tracings, evidence has been
presented to indicate that many patients recover from head injuries,
gradually approaching more normal results (Jasper, Kershman, [...]).

The variation in type and severity of brain [...] associated with
developing disease processes has been well [...] (Wechsler, 1958).
Relatively few investigators have used such patients in psychological
evaluations (Battersby, Krieger, [...] Bender, [...]; Halstead, 1947;
[...] Mark, [...]; Reitan, 1955). The diversity of these conditions,
either within or between diagnostic categories, is well known to
neurologists, neurological surgeons, and neuropathologists. It is not
beyond the scope of
reasonable possibility that different diagnostic 
conditions may reflect themselves differently 
in psychological testing. In fact, Reitan 
(1959b) has recently demonstrated that in- 
ferences based on psychological test results 
alone (without reference to anamnestic ma- 
terial or other findings) identify patients with 
head trauma, brain tumors, cerebrovascular 
accidents, and other diagnostic conditions at 
levels far exceeding chance expectancy. The 
dependent variables, or psychological tests, 
have also frequently varied from one study to 
another among different investigators. This 
factor also would contribute a certain amount 
of variance to the conclusions drawn. 
Because of the impossibility of simultane- 
ous manipulation of the many factors that 
are probably relevant, the results of any 
single study in this area must be viewed as 
tentative. The present findings, however, agree 
with certain others in which the same instru- 
ments were used in indicating that the effects 
of brain damage may be measured reliably 
(Fitzhugh, Fitzhugh, & Reitan, in press; 
Kløve, 1959; Kløve & Reitan, 1958; Reitan, 
1955a, 1955b, 1958). Additionally, among 
the brain damaged groups consistent trends 


were observed revealing greater psychological
impairment in patients who suffered from
acute organic brain damage or disease than 
in groups with relatively static organic dam- 
age. The results suggest that the nature of the 
brain lesion at the time of psychological test- 
ing is an important variable and should be 
considered in studies of psychological deficits 
in association with brain damage. 


SUMMARY 


One Control group and three brain dam- 
aged groups, each composed of 16 patients, 
were compared on the Wechsler-Bellevue In- 
telligence Scale variables and eight Halstead 
Neuropsychological indicators in order to in- 
vestigate psychological impairment in relation 
to acuteness of organic brain dysfunction. The 
Control group’s performances consistently ex- 
ceeded the performances of the brain dam- 
aged groups. Also, two static lesion groups 
(one institutionalized) rather consistently per- 
formed at levels superior to the levels of the 
Acute lesion group. The results suggested that 
acuteness of the organic brain lesions is an 
important variable to be considered in studies 
of psychological deficits among brain dam- 
aged subjects. 
TEMPORAL AND EMOTIONAL FACTORS IN THE
SELECTIVE RECALL OF DREAMS 
There is increasing evidence that everyone dreams approximately five
times every night (Aserinsky & Kleitman, 1955; Dement, 1955; Dement &
Kleitman, 1957; Dement & Wolpert, 1958). Yet even when there is
motivation to recall dreams, as in psychotherapy or research, some
people recall no dreams at all (Schonbar, 1959), and none report
anywhere near the maximum possible. Why are so many dreams lost? What
characterizes those which are remembered?

The present study is based upon the observations of the investigators
cited above that dreams occur intermittently during the whole sleep
cycle (except in the first hour), that they are associated with lighter
phases of sleep as indicated by EEG, and that they tend to get longer
during the night. The study grew out of Freud’s theories concerning the
function of dreams and of a theory arising from Freud’s.
According to Freud (1949, 1953, 1957), a major function of the dream
is to preserve the sleep of [...]. In sleep, the ego gives up its
cathexes in both the external and internal worlds; the unconscious or
id, however, does not sleep, and, because of the relaxation of the
censorship of the somnolent ego, becomes more able to intrude its
wishes upon the individual. Were these wishes to be expressed in
undisguised form, they would create sufficient anxiety to awaken the
sleeper. Freud therefore considers the dream to be an economical
compromise, with the [...] impulses allowed expression in disguised
form, experienced as objective rather than [...] events, thus not
demanding full [...], but allowing sleep to continue.


If the demand made by the unconscious is too great, so that the
sleeping ego is not in a position to ward
it off by the means at its disposal, it abandons the
wish to sleep and returns to waking life... . every 
dream is an attempt to put aside a disturbance of 
sleep . . . . This attempt can be more or less com- 
pletely successful; it can also fail—in which case the 
sleeper wakes up, apparently aroused by the dream 
itself (Freud, 1949, pp. 56-57). (Quoted by permis- 
sion of Norton) 


Such failures are identified as anxiety-dreams 
(Freud, 1953). 

Related to a part of Freudian theory is 
Gutheil’s view (1951) that the dream serves 
to protect the integrity of the ego. Gutheil 
proposes that dreams are most likely to oc- 
cur just as the individual is falling asleep and 
just as he is awakening, so that the ego is able 
to make gradual adjustments to the differing 
demands of the two states and is not pressed 
into abrupt, and possibly disintegrating, 
changes in function. Gutheil predicts further 
that dreams from the falling asleep period 
are less likely to be remembered than those 
from just before waking because of the long 
period of unconsciousness which intervenes in 
the former case.

For the most part, the above views grew 
out of the analysis of retrospectively recalled 
dreams, mostly of patients in psychotherapy 
or of Freud himself. There was no way of 
knowing then that these dreams were merely 
a sample of a much larger and determinable 
number. The present study is concerned with 
testing some propositions based in theory, but 
in terms of selective recall, since this is the 
significant factor in what is available to us 
under nonlaboratory conditions.

The first two hypotheses to be tested are 
that more dreams are remembered as having 
preceded a waking period than as having 
preceded continued sleep, and that propor- 
tionately more dreams are remembered as oc- 
TABLE 1

NUMBER OF DREAMS IN EACH TEMPORAL AND FEELING CATEGORY
FOR GROUPS H (N = 19) AND L (N = 19)

                                                   Feelings
                                                       Unpleasant
Time                       Group  Neutral  Pleasant  Nonanxious  Anxious  Total  Combined
Indeterminate(a)           H         47       18         10         22      97     115
                           L          7        5          3          3      18
Awoke Dreamer(b)           H         14        4         14         21      53      65
                           L          6        1          1          4      12
Awoke Dreamer in Morning   H          1        1          1          3       6       8
                           L          0        0          0          2       2
Morning                    H         21        8         14         14      57      73
                           L         12        2          0          2      16
Total                      H         83       31         39         60     213
                           L         25        8          4         11      48
Combined                            108       39         43         71     261

(a) Includes 7 dreams identified as F (Falling asleep).
(b) Includes 4 dreams identified as FW (Awoke the dreamer while he was falling asleep).

Hypothesis 4. Of the dreams having feelings other than N ascribed to
them, the number of P and U feelings was tested against a 50-50
expectancy. For Group H, χ² was 37.7, p < .0001. For Group L, χ² was
2.14, .10 > p > .05. The recalled dreams of frequent recallers are
characterized not only by emotional more often than neutral
components, but also by more unpleasant than pleasant feelings. The
reported dreams of infrequent recallers, on the other hand, are not
only more emotionally neutral, but, when feelings are remembered, they
are not more likely to be unpleasant than pleasant.
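
The 50-50 test used for Hypothesis 4 can be illustrated with a short sketch. It is an illustration only: the function name is hypothetical, no continuity correction is applied (the text does not say whether one was used), and the counts are the Group L totals read from Table 1 (8 pleasant dreams; 4 nonanxious plus 11 anxious unpleasant dreams).

```python
def chi_square_even_split(count_a, count_b):
    """Chi square (1 df) for two observed counts against a 50-50 expectancy."""
    expected = (count_a + count_b) / 2
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected


# Group L totals from Table 1: pleasant vs. unpleasant (nonanxious + anxious).
pleasant_l = 8
unpleasant_l = 4 + 11

chi2_l = chi_square_even_split(pleasant_l, unpleasant_l)
```

Under these assumptions the Group L value comes out close to the 2.14 reported above, which falls short of the one-tailed .05 level.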

Hypothesis 5. Dreams which awakened the Ss (W, MW) were divided into
those with anxiety and those without; for Group H, this distribution
was tested against the distribution of anxiety and its absence in all
other dreams. χ² was 65.26, significant beyond .0001. This hypothesis
was not tested for Group L because the frequencies were too small,
although they fell in the predicted direction. It may be concluded
that, while anxiety is not experienced in a majority of Group H’s W and
MW dreams, a significantly greater proportion of anxious feelings is
associated with these dreams than with those which did not awaken the
individual.

Hypothesis 6. The prediction that dreams recalled as having occurred
at indeterminate times during the night and followed by continuous
sleep contained more pleasant feelings than other dreams was tested in
the same manner for Group H, Group L not having high enough
frequencies for meaningful testing. χ² was 7.15, significant at the
.003 level. Again, although most of these dreams were not remembered
as pleasant, they were remembered as being significantly more pleasant
than all others.

Hypothesis 7. Group H designated 83 of its 213 dreams as being neutral
in feeling, Group L, 25 of its 48. When the distribution of N and
other feelings in Group L was compared with an expected frequency
based upon the distribution in Group H, χ² was 5.28, significant at
the .01 level. In addition, this hypothesis received inadvertent
support in the testing of Hypothesis 3. People who remember fewer
dreams remember proportionately more of them as neutral in feeling
than do people whose dream recall is greater.


Discussion


The experimental evaluation of any theory 
is a two-step process. First, it must be de- 
termined whether the events or processes as- 
sumed, implied, or predicted by the theory are 
confirmed by observation. Second, and more 
difficult, is the matter of whether the given 
theory explains the observations more adequately
or more economically than other al-
ternatives. 

For example, Freud said, “Dreams are the GUARDIANS of sleep . . .”
(1953, p. 233). They occur, then, when sleep is endangered. And
investigations into the physiological correlates of dreaming have
demonstrated that dreaming does occur during periods of lighter sleep.
A necessary factual prerequisite has thus been established. But it
does not necessarily follow that these “new laboratory experiments
. . . have corroborated Freud’s brilliant guess” (Robinson, 1959,
p. 52). Freud would maintain that unconscious wishes have brought
about the lighter sleep by striving for expression, that the dream
puts a stop to this so that sleep may continue. But it may also be, of
course, that the dream itself somehow interferes with the depth of
sleep. Thus, the discovery of the correlation of dreaming and lighter
phases of sleep is a necessary but not sufficient condition for
support of Freud’s theory.

The research reported here is similarly concerned with establishing
observationally the verification of some assumptions or implications
of Freud’s views. For example, it was found that, in general, dreams
were not better remembered simply because they preceded a waking
period, and that, for those who recalled relatively more dreams, the
period just before normal waking did not produce more than its
proportionate share of remembered dreams. Not only do these findings
fail to support Gutheil’s statements, but they cast considerable doubt
upon any [...] that dream recall is primarily dependent upon factors
similar to those studied by Ebbinghaus and others: recency,
opportunity for recitation, greater length, possibly greater
emotionality, and lack of opportunity for retroactive inhibition.
Seemingly, more selective factors are operating.


Similarly, if dreams merely repeat events 
of the day before, or are arbitrary representa- 
tions of digestive processes, or responses to 
fortuitous external stimulation, then it should 
not have been found, as it was for Group H, 
that the dreams were accompanied by emo- 
tion more often than not, or that the emotion 
was more often unpleasant than pleasant. But 
these findings are necessary to a theory which 
postulates that the dream represents conflict 
which is important to the dreamer. 

The finding that dreams which awakened 
the sleeper were proportionately more often 
identified as anxiety dreams than were dreams 
followed by further sleep or normal waking 
directly supports one of Freud’s theoretical 
statements. The finding that dreams followed 
by continued sleep contain more pleasant 
feelings than do other dreams is somewhat 
ambiguous. It would seem that these dreams 
might be the most “successful” in Freud’s 
sense—at least of remembered dreams—not 
disturbing sleep, and possibly most disguised 
in the sense of being remembered as enjoy- 
able; one cannot, however, wholly discount 
the likely possibility that the unpleasant as- 
pects of these dreams became the victim of 
further repression during sleep, but there is 
no way of finding out from these data. It 
should be emphasized that even these dreams 
are not characterized as pleasant; it is rather 
that pleasant feelings are more likely to be as- 
sociated with them than with others. 

This study has replicated, in a nonclinical 
situation and with a nonpatient sample, the 
clinical procedure in which people report 
dreams which they remember, thus providing 
material similar to that upon which Freud 
and other psychoanalysts have made their ob- 
servations. The findings of this study, at least 
with people who tend to recall dreams, con- 
firm the validity of the observations upon 
which some aspects of Freud’s dream theory 
were built. 

There was only one prediction concerning 
the relationship between feelings in dreams 
and the greater or lesser tendency to recall 
dreams. This was that recallers of relatively 
few dreams would also remember them as 
being more neutral in feeling than would 
more frequent recallers, and this was con- 
firmed. In addition, dreams recalled by the 
low recallers occurred disproportionately more
often from the period preceding morning 
waking and, if feelings were attributed to the 
dreams, they were not more likely to be un- 
pleasant than pleasant. It may be that a 
larger sample of dreams from infrequent re- 
callers might have reversed the latter finding 
but, as it stands, it would seem that people 
who recall few dreams also recall them as be- 
ing fairly bland or less unpleasant than do 
people who recall more frequently. It is pos- 
sible that the less frequent recallers do not 
have so many conflicts so that their dreams 
are, in fact, more bland. But it is at least 
equally possible, and seems more likely, that 
people who repress more of their available ex- 
perience, as in forgetting dreams, reveal this 
repression rather generally by also toning 
down the affect. Previous research (Schonbar,
1959; Singer & Schonbar, 1959) has found 
that dream recall is positively related to mani- 
fest anxiety, but, since the latter was meas- 
ured by conscious self-report, the dilemma is 
merely emphasized rather than resolved. A 
similar question arises concerning the finding 
(Singer & Schonbar, 1959) that repression 
(MMPI R scale) and dream recall are nega- 
tively correlated. But there is also evidence 
(Schonbar, 1959) that people who recall no 
dreams also tend not to recall even the process 
of dreaming. It would thus seem that these 
people exhibit a pattern of repression or lack 
of awareness of the presence and nature of 
their own dream processes. A pattern is sug- 
gested, wherein people who tend to be aware 
of their own internal experience remember 
more dreams and more of the affect associ- 
ated with them, while less aware individuals 
remember fewer dreams and blander affect. 
In summary, then, the findings of this study 
support the underpinnings of some aspects of 
Freud’s theory of dreams and fail to support 
Gutheil’s contention. From the more difficult 
point of view of theoretical adequacy, it is 
worth noting that, while Freud attributed the 
memory of anxious (and possibly of unpleas- 
ant) dreams and the existence of dreams 
which disturb sleep to a breakdown or fail- 
ure of ego function, other analytic theorists 
(Fromm, 1951; Hadfield, 1954) would give 
credit for these events to a successful breakthrough
of self-realizing, insight-producing
forces. But the same kind of substructure of 
intrapsychic conflict is assumed by them as 
by Freud, and the findings of this research, 
therefore, also offer confirmation of their 
views. 


SUMMARY 


Forty-five graduate students in education turned in reports on
recalled dreams every day for 4 weeks. These reports included
information concerning the time during the sleep cycle when the dream
occurred and what kinds of feelings were associated with it. The total
group was divided into groups above and below the median in dream
recall. One-tailed chi square tests were used to test predictions
based primarily upon formulations drawn from Freud’s theory of dreams.
It was found for both groups that dreams preceding a waking period are
not better remembered than dreams followed by continued sleep, that
dreams which awaken the sleeper are proportionately more often
associated with anxiety than dreams which do not, and that dreams
which are followed by continued sleep are recalled as proportionately
more pleasant than dreams followed by any kind of waking. For the
frequent recallers, it was also found that dreams are more often
remembered as having had emotional components than as having been
neutral, that the feelings are more often unpleasant than pleasant,
and that the period just before normal morning waking does not produce
more than its proportionate share of remembered dreams. For the group
which was low in recall, the recalled dreams did not contain more
emotional than neutral attributes, and feelings were not more
unpleasant than pleasant; more dreams were remembered by this group
from the period just preceding normal waking than would be expected.
In addition, a direct comparison of the two groups revealed, as
predicted, that the low recall group had significantly more neutral
dreams than the high recall group. In general, it was concluded that
the findings of this study support some of the propositions in Freud’s
theory of dreams. The study is not seen as a crucial test of theory.


Since the development of a rather simple
instrument for assessing manifest anxiety 
(Taylor, 1953), there has been an epidemic 
of psychological studies concerned with the 
role of anxiety in a wide range of experi- 
mental situations. Here, we will not attempt 
to survey this vast literature. We wish merely 
to point out that these studies of anxiety have 
been conducted mainly in laboratory and aca- 
demic research settings, and little use has 
been made of the instrument in clinical or 
“real life” situations. The developers of the 
instrument and many of their followers have 
stated (Taylor, 1956) that they are not really
concerned with measuring anxiety, but are
interested in obtaining a measure of “drive.”
This concept of drive is viewed within the
framework of Hullian learning theory.
According to this theory, all habit tendencies
activated by a given stimulus are considered 
to be multiplied by the total drive then oper- 
ating. Employing the Manifest Anxiety Scale 
(MAS) to provide a measure of drive strength, 
the performances of subjects selected on the 
basis of high or low anxiety scores have been 
compared on such measures as eyelid condi- 
tioning (Hilgard, Jones, & Kaplan, 1951; 
Spence & Farber, 1953; Spence & Taylor,
1951; Taylor, 1951), verbal learning (Lucas,
1952; Montague, 1953; Taylor & Spence,
1952), word association (Davids & Eriksen,
1955), and various other more complex tasks
(Farber & Spence, 1953; Wesley, 1953;
Westrope, 1953).


1 This study was made possible by a research
grant, B-2356, from the National Institute of
Neurological Diseases and Blindness, United States Public
Health Service, awarded to the Brown University
Institute for Research in the Health Sciences. The
present report stems from an ancillary study to the
National Collaborative Project, conducted locally at the
Providence Lying-In Hospital, which is investigating
perinatal factors in child development. We wish to
express our appreciation to Glidden Brooks, who is
Director of the Research Institute at Brown
University, for facilitating this study. Also, we are
indebted to the clinic staff of the Providence Lying-In
Hospital for their cooperation and assistance.

There have been some attempts to assess
the clinical validity of the MAS (Buss,
Wiener, Durkee, & Baer, 1955; Gleser &
Ulett, 1952; Hoyt & Magoon, 1954; Kendall,
1954), and in general it does seem to be
associated with clinical evaluations of anxiety.
Moreover, Eriksen and Davids (1955)
reported finding significant personality
differences between subjects who scored high or
low on the MAS, and also differences in
psychological defense mechanisms. More
specifically, it was found, in a group of male
college students, that subjects who were high on
the MAS were also pessimistic in their
outlook on life and were relatively low on
utilization of the mechanism of repression
according to the evaluation of an experienced
psychoanalyst.

It seems, then, that the MAS has
demonstrated utility as a research instrument and
has generated considerable interesting
research. However, since most personality
theorists place great emphasis on anxiety as a
motivating factor in life adjustment, and since
it is a well established fact that anxiety plays
a crucial role in the formation of
psychopathology, it seems worthwhile to conduct
further research on the clinical utility of this
objective instrument for assessing manifest
anxiety.

At present, there appears to be increasing
research interest in the effects of anxiety and
stress on the psychological course of
pregnancy and the influence that emotional
turmoil during pregnancy may have on the
sequent adjustment of the offspring. In a 
study of physical and mental handicaps fol- 
lowing disturbed pregnancy, Stott (1957) 
suggested that prenatal influences were to 
blame. In studying a group of mentally de- 
fective children, he found that in a large pro- 
portion of the cases there had been marked 
emotional stress during pregnancy, as a re- 
sult of family conflicts and personal unhappi- 
ness. In a recently reported study of the in- 
fluence of prenatal maternal anxiety on emo- 
tionality in rats, Thompson (1957) tested 
and confirmed the hypothesis that “emotional 
trauma undergone by female rats during
pregnancy can affect the emotional characteristics
of the offspring.”

The plan of the research program from
which the present report derives employs a
variety of psychological procedures to study
emotional factors in pregnant women. This
report, however, is concerned specifically with
findings obtained from the MAS administered
to a group of women during pregnancy and
readministered soon after delivery of their
children.


METHOD 


The subjects of this investigation were
pregnant women who were studied at the
Providence Lying-In Hospital. They are a
representative sample of a larger group of women who
were studied in the course of a study
conducted by a team of medical and scientific
investigators who were engaged in a collaborative project
on perinatal factors in child development. The patients
were seen for individual psychological testing
during the course of a routine visit to the clinic, which in
most cases was at approximately the seventh
month of pregnancy. Of the group of women, 20
returned for a routine physical checkup at
approximately 6 weeks following childbirth, and the other
28 women failed to return for this scheduled hospital
visit. The 20 patients who were seen twice will be
labeled Group I, and the 28 women who were seen
only during pregnancy constitute Group II.


In the course of the large scale investigation,
various data were gathered for each patient. As
part of the assessment, they were administered a
comprehensive battery of psychological tests.
Included in this assessment procedure was the MAS,
which is the focus of the present report. In
Group I, the MAS was administered both during
and following pregnancy, while in Group II it was
administered only during pregnancy. From
the official hospital records, it was possible to
classify each patient’s delivery room record as “normal”
or as indicating some “abnormality or complication.”
In Group I, there were 13 patients in the normal 
category and 7 patients in the abnormal category. 
In Group II, the subdivisions were 12 normal de- 
liveries and 16 with abnormalities or complications. 

The patients in both groups were of “normal” in- 
telligence. As measured by the Wechsler-Bellevue In- 
telligence Scale, the mean IQ in Group I was 101 
and the mean IQ in Group II was 95. Moreover, in 
both groups the mean age was 25 years and ranged 
from 17 to 40 years. Thus, although no attempt was 
made to match the patients in the two groups, it 
happened that the groups were of very similar age 
and IQ, and in regard to these two variables it seems 
probable that they are representative of pregnant 
women who are being studied at various clinics 
throughout the country. 


RESULTS AND DISCUSSION 


Now let us consider the findings from the 
MAS. In Group I, on the first testing, the 
normal subgroup obtained a mean manifest 
anxiety score of 16.5, which is significantly 
lower (t = 2.19, p = .05) than the mean of
23.5 in the abnormal subgroup. Examination 
of the ranges of the manifest anxiety scores 
in the two subgroups further evidences the 
greater anxiety in the abnormal group, with 
their scores ranging from 14 to 37, as com- 
pared with scores ranging from 8 to 26 in the 
normal group. Thus, both the mean scores 
and the spread of the individual scores re- 
veal the abnormal delivery group to have been 
relatively high on manifest anxiety according 
to their own avowal of feelings and symptoms 
during pregnancy. In analyzing the results 
from the second testing of the patients in 
Group I, it is noteworthy that the level of 
manifest anxiety decreased in both subgroups 
following pregnancy, with a mean of 15 in 
the normal subgroup and a mean of 18.3 in 
the abnormal subgroup. Although the group 
that experienced difficult deliveries continued 
to score higher on manifest anxiety than did 
the group who had normal delivery room
experiences, the nonsignificant difference (t =
.70) was not as pronounced as it was when
the women were in a state of pregnancy. 

The findings in regard to manifest anxiety 
in Group II were remarkably similar to those 
obtained in Group I. In this second group of 
patients, the mean MAS score in the normal 
subgroup was 16, which is significantly lower 
(t = 2.39, p = .05) than the mean score of 
23.6 in the abnormal subgroup. Again, the
range of MAS scores from 4 to 30 in the nor-
mal subgroup was noticeably lower than the 
range from 12 to 38 in the abnormal sub- 
group. Thus, in both samples studied in this 
research, it was found that women who were 
later to experience complications in delivery 
or were to give birth to children with abnor- 
malities tended to report a relatively high 
amount of disturbing anxiety while they were 
pregnant. 
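
The subgroup comparisons above rest on a t test between two independent means. As a rough illustration only, the following Python sketch computes a pooled-variance Student's t. The report does not state which form of the t test was used, so the pooled form is an assumption, and the score lists are invented for the example rather than taken from the study; only the subgroup sizes (13 and 7) follow Group I.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Student's t for two independent samples, using a pooled variance."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance (sample variances use n - 1).
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical MAS scores, NOT the study's data (sizes match Group I only).
normal = [8, 10, 12, 13, 15, 16, 17, 18, 19, 20, 21, 22, 26]
abnormal = [14, 17, 20, 22, 25, 29, 37]

t, df = pooled_t(abnormal, normal)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

With subgroups of 13 and 7 the test has 18 degrees of freedom, and with 12 and 16 it has 26. The two-tailed .05 critical values of t are about 2.10 and 2.06 respectively, so the reported values of 2.19 and 2.39 exceed them, while the postpartum value of .70 does not.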

In considering these findings, it should be 
emphasized that at present we have no infor- 
mation regarding the causes or reasons under- 
lying the higher MAS scores in the abnormal 
subgroup. One possibility is that the obste- 
tricians may have anticipated abnormalities 
or complications, and may have conveyed
this information to the patients. However, 
this possibility does not seem too likely, as for 
the majority of these patients the psychologi- 
cal assessment was conducted during their 
first visit to the clinic. That is, these women 
did not have private obstetricians who fol- 
lowed their medical progress throughout the 
pregnancy, but were being seen for their first 
medical examination at a rather late stage of 
their pregnancy. Future examination of
sociological, medical, and past history data on
these patients may contribute to greater
understanding in this area, and further insight may come
from comparisons of clinic and private
patients. One other point that should be made
at this time, however, is that there was no
difference between the two subgroups in
regard to the number of patients for whom this
was the first delivery. The mean number of
previous pregnancies and previous deliveries
was practically identical in the normal and
abnormal subgroups.


It is also interesting to note that the mean
MAS scores of about 16, obtained by the
normal subgroups both during and after
pregnancy, are very similar to the mean MAS
scores obtained previously in relatively large
samples of female college undergraduates
(Smith, Powell, & Ross, 1955; Taylor, 1953).
The present findings suggest, therefore, that,
as a group, pregnant women who will later
experience normal childbirth do not differ
from normal nonpregnant college females in
the avowal of manifest anxiety, but pregnant
women who are likely to experience child-
birth abnormalities later are significantly 
higher on manifest anxiety than are other 
groups of pregnant and nonpregnant women. 

The results of this preliminary study, which 
should be regarded as tentative and in need 
of further independent confirmation, are quite 
encouraging. In addition to demonstrating the 
utility of the MAS in this clinical setting, the 
positive findings obtained with this objective 
instrument suggest that even more fruitful re- 
sults may be obtained through use of projec- 
tive techniques designed to uncover indices of 
emotional factors operating at deeper levels in 
the personality. It is hoped that the intensive 
program of investigation we have embarked 
upon will eventually lead to greater psycho- 
logical understanding of complex relations be- 
tween maternal psychodynamics during
pregnancy and the process of child development.


SUMMARY 


The purpose of this research was to com- 
pare measures of manifest anxiety obtained 
during pregnancy and following childbirth, 
and to relate these anxiety measures to de- 
livery room experiences. In two independent 
samples of clinic patients, women who were
later to experience complications in the
delivery room or were to give birth to
children with abnormalities obtained significantly
higher manifest scores during pregnancy than
did women who later had “normal” delivery
room records. The results obtained from
retesting one of the samples shortly after
childbirth showed decreased levels of manifest
anxiety both in patients who had undergone
normal childbirth and those who had experienced
complications or abnormalities. Manifest
anxiety scores were still relatively higher in the
latter subgroup, but the difference was no
longer significant. It was concluded that these
findings demonstrate the clinical utility of
the Manifest Anxiety Scale, and also suggest
that future research with projective methods may
lead to greater psychological understanding
of the role of emotional factors in
pregnancy and childbirth.


REPEAT STUDY WITH A PROJECTIVE FILM
FOR CHILDREN


MARY R. HAWORTH 1
Michigan State University 


Rock-A-Bye, Baby (Haworth, 1960; Ha- 
worth & Woltmann, 1959) is a projective 
puppet film which can be shown to groups of 
children. The film story focuses on a little 
boy, Casper, and his jealousy of his baby 
sister. When left to baby-sit, he begs the 
witch to help him get rid of the baby. She 
puts a spell on the milk; mother returns and 
rushes the baby to the hospital. Casper is 
filled with remorse, recalls the witch, and 
finally kills her. Thus the spell is broken, the 
baby’s health is restored, Casper’s guilt is re- 
solved, and his parents reassure him of their 
love by a gift of strawberry ice cream. Wolt- 
mann (1951) gives the complete script of the 
play, as well as the rationale for the use of 
puppets in projective devices for children.

The film is shown to entire classes, divided
into groups of 10 to 15 children per showing.
Responses are first secured halfway through
the showing when the film is stopped and each
child in the group is invited to finish the story.
After the rest of the film is shown, each child
is asked, individually, a standard set of
questions in terms of Casper: what he thought of
his parents and of the witch, how he felt when
the baby got sick, whether he should be
punished for what he did, what he should tell his
mother, and how he felt when the baby got
well. The child is also asked what part he,
himself, liked best and which character he
would like to be.


The film was originally administered to 244 
children, from nursery school through fifth 
grade, as reported by Haworth (1957). A
scoring scheme (Haworth & Woltmann, 1959)
was devised based on patterns of deviant
responses given to the standard questions and
in the group discussion during the showing.
The following indices emerged as
representing dimensions of personality that appear to
be tapped by this particular film:
Identification, Jealousy (sibling rivalry), Aggression
toward Parents, Guilt (masturbatory),
Anxiety (castration), and Obsessive Trends.

The film has subsequently been shown to
a new sample of 257 children (kindergarten,
first, and second grades) in order to ascertain
whether similar proportions of children would
score high on the various indices, and whether
the developmental progressions which
appeared to be demonstrated in the earlier study
would be substantiated in the second sample.
A cross-validation analysis was planned for
the two indices (Guilt and Jealousy) for which
criterion groups can be selected from the
samples. One further aspect of the present
study is concerned with scoring reliability.


1 The author is indebted to the principals and
teachers who cooperated in the project, and to Mary
Grummon, Ruth Karslake, and James Mathie who
assisted in interviewing the children and scoring the
protocols.


THE SAMPLES 

Table 1 shows the distribution of children, by
grades, in the first sample (A, Pennsylvania) and
the second sample (B, Michigan). Sample A was
heavily weighted toward the upper end of the
socioeconomic scale, since 95 of the 244 children came from
school areas serving predominantly university
faculty and professional and managerial groups.
The remaining 149 children were drawn from a
community representing all occupational
levels. In Sample B, half (128) were attending a
school which serves the entire range of occupational
levels, while 129 children came from a marginal
district of predominantly lower class families.


TABLE 1
DISTRIBUTION OF SAMPLES BY GRADES

          Nursery  Kinder-
Sample    School   garten     1      2      3      4      5    Total
A           40       —       112     —      45     —      47    244
B           —        86      124     47     —      —      —     257


As the original study was particularly concerned
with the responses of the large first grade group,
approximately the same number of first graders were
secured in the second sample for comparative
purposes. The groups will hereafter be referred to by
grade and letter, e.g., 1-A indicates the first grade
of Sample A; K-B, kindergarten of Sample B.


RESULTS 


Index Scores and Developmental Progressions 


A comparison of high scores and deviant
identification choices 5 made by the two first
grade samples revealed only one substantial
difference: significantly more children of 1-A
made deviant identification choices (χ² =
3.805; p = .051). The specific item that
accounted for most of this difference was
identification with the opposite-sex parent, with
this choice being made more often by 1-A
than by 1-B children.

Figure 1 demonstrates the otherwise close
correspondence between the two first grades,
and includes the fifth grade curve for
comparison of the incidence of high scores on each
index at different ages.

Developmental progressions for each
dimension are shown in Figures 2 and 3. Aggression,
Guilt, and Anxiety show congruent trends
(Figure 2) with a fairly large percentage of
high scores in the early grades and a decided
drop occurring between second and third
grades. Figure 3 shows the three indices which
maintain fairly constant levels in the later
years. Jealousy and Identification start high
and remain stabilized at the second grade
level, while the Obsessive trends show a
constant and much lower level throughout the
age range under study.


5 Tabulations were made of only the five
identification choices which are always deviant, as
distinguished from another category of choices which are
deviant at certain ages or for a specific sex. Each of
the other indices requires a specified number of
responses for a high score, except Aggression to
Parents. For purposes of the present study,
one “aggressive” response is considered a high
score.


Fig. 1. Percentages of two first grades (1-A, 1-B)
and fifth grade (5-A) scoring high on each index.


Cross-Validation of the Guilt Index 


The Guilt Index was originally derived 
(Haworth, 1957) from patterns of deviant 
responses given by 10 of the 12 children in 
the 1-A group who had been observed to en- 
gage in autoerotic practices (masturbation or 
thumb sucking) either during the film show- 
ing or the inquiry period. Similar response 
patterns were given by only 2 of the 100 
“nonautoerotic” children.6 The seven items in
the index relate to: (a) mother not knowing
what went on, (b) Casper being sent to bed
as punishment, (c) Casper feeling rejected by
father, (d) Casper being very ashamed or
resolving to make amends, and references to
(e) the baby stinking, (f) Casper or the baby
being in the water, (g) the kissing scenes.
Because of the guilt tinged aspect of most of
these responses, it would appear that children
who respond with high scores (i.e., at least
two of the seven items) may be those who not
only engage in autoerotic acts but who also
feel guilty for so doing. If such responses were
also given by the “autoerotic” children in the
new sample, considerable validity would be
demonstrated for this index.

No statistical test was performed on the
1-A group since this was an ad hoc approach.
In the present study, it was predicted that
more autoerotic (AE) than nonautoerotic
(non-AE) children would score high on the
index.


6 The Guilt Index did not prove to be applicable
to the third and fifth grades. Only 2 (out of the
combined total of 92) children received high scores,
and neither of these was one of the four observed
“autoerotic” cases in the two grades.


Fig. 2. Percentages of high scores on Aggression to
Parents, Guilt, and Anxiety Indices from
kindergarten through fifth grade. (Scores are averaged for
two first grades.)



Table 2 shows the distribution of high
scores (two or more items) as contrasted to
low scores (one guilt item or none at all).
The predictions were upheld, with
significantly more AE than non-AE children
receiving high guilt scores; this difference was
especially marked in kindergarten and first
grade.
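
The tests named in Table 2 can be retraced with a short Python sketch, offered here only as an illustration. It applies the usual continuity-corrected chi square to the kindergarten and first grade counts and a one-tailed Fisher exact test to the second grade counts. The function names are hypothetical, the one-tailed form of the Fisher test is an assumption, and the nonautoerotic K-B low-score count is taken as 54 (58 children minus 4 high scorers), which is also an assumption.

```python
from math import comb


def yates_chi_square(a, b, c, d):
    """Chi square with continuity correction for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # The correction subtracts n/2 from |ad - bc|, never going below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def fisher_one_tailed(a, b, c, d):
    """Probability of `a` or more cases in the top-left cell with fixed margins."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favorable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favorable / comb(n, col1)


# Rows: autoerotic, nonautoerotic; columns: high, low guilt scores.
chi_kb = yates_chi_square(11, 17, 4, 54)   # K-B (54 is assumed, see lead-in)
chi_1b = yates_chi_square(13, 19, 7, 85)   # 1-B
p_2b = fisher_one_tailed(5, 11, 1, 30)     # 2-B

print(round(chi_kb, 2), round(chi_1b, 2), round(p_2b, 3))
```

Under these assumptions the sketch gives chi squares of about 11.60 and 16.77 and a one-tailed probability of about .013, which agree closely with the tabled values of 11.61, 16.76, and .013.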

The original criterion (Haworth, 1957) for
inclusion of a response as an item in the
index stated that its incidence in the AE group
(N = 12) must be at least one-third of its
total incidence for all 112 children of the 1-A
sample. Actually, for five of the seven items,
at least one-half of the responses came from
the small AE group. Table 3 shows the
distribution of guilt responses in the three grades
of Sample B to be quite similar to that of the
original 1-A sample from which the index was
drawn. It can be seen that, irrespective of
high scores, the AE children make more use
of the guilt items than do non-AE children,
so that the original criterion was still met in
all but two instances. (These involved Item
No. 5, which was given only once in K-B and
in 1-B, and by non-AE children in both
cases.) The criterion was exceeded by five
items in K-B, four items in 1-B, and three
items in 2-B.

Fig. 3. (Scores are averaged for two first grades.)
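
The two scoring rules described for the Guilt Index, the one-third criterion for retaining an item and the two-item threshold for a high score, can be written out as a minimal Python sketch. The function names and the example counts are hypothetical and are not taken from Table 3.

```python
def meets_item_criterion(ae_count, non_ae_count):
    """True when the AE group supplies at least one-third of all uses of an item."""
    total = ae_count + non_ae_count
    if total == 0:
        return False  # an item that nobody gave cannot qualify
    # Integer comparison avoids rounding: ae / total >= 1/3.
    return 3 * ae_count >= total


def is_high_guilt_score(items_checked):
    """True when a child gives at least two of the seven guilt items."""
    return len(set(items_checked)) >= 2


# Hypothetical counts for one item: 4 uses by AE children, 8 by non-AE children.
print(meets_item_criterion(4, 8))
# Hypothetical protocol in which items (a) and (d) were checked.
print(is_high_guilt_score(["a", "d"]))
```

Because the AE group is much smaller than the non-AE group in every grade, an item that meets the one-third criterion is being given at a far higher rate by AE children than by the others.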


TABLE 2

INCIDENCE OF HIGH AND LOW GUILT SCORES IN AUTOEROTIC
AND NONAUTOEROTIC GROUPS (SAMPLE B)

                           Guilt Scores
Grades              N      High    Low      χ²        p
K-B                                        11.61 a   <.001
  Autoerotic        28      11      17
  Nonautoerotic     58       4      54
1-B                                        16.76 a   <.001
  Autoerotic        32      13      19
  Nonautoerotic     92       7      85
2-B                                                   .013 b
  Autoerotic        16       5      11
  Nonautoerotic     31       1      30

a Corrected for continuity.
b Fisher exact probability test.


Cross-Validation of the Jealousy Index


In the original study (Haworth, 1957) the
items of the Jealousy Index were selected on
an a priori basis from responses of the 1-A
group which appeared to indicate sibling
rivalry. The 11 items of this index include: (a)
one response of Casper being jealous of
attention given to the baby; (b) a minimum
of two uses by the subject of slips of tongue,
evasions, or personal references; aggression
against the baby expressed openly (c) while
the film was being shown, (d) in the
half-show discussion, or (e-j) in answer to any of
six specified inquiry questions. For boys, (k)
choosing to be the baby is an additional item.
A high score consists of any three of the above
responses.

The 1-A sample was divided into two
groups on the basis of sibling status, with
oldest + middle children in one group, and
youngest + “only” children in the other group.
Significantly more of the former group scored
high on the Jealousy Index, and the difference
was largely due to the boys’ responses in each
grouping. Within the oldest + middle
grouping, significantly more oldest than middle
children received high scores.

A similar analysis of the 1-B sample (as
well as K-B and 2-B) revealed no differences
between the various groupings: oldest +
middle vs. youngest + only, oldest + middle boys
vs. youngest + only boys, or oldest vs.
middle children. The total incidence of high
jealousy scores was also less for 1-B (16.9%)
than for 1-A (23.2%), but this difference was
not significant. There were some indications
that differences in family size, ordinal
position, or socioeconomic status might be
responsible for the lack of replication in Sample B.
Much larger samples would be required to
secure enough high scoring cases for an
analysis of these multiple variables.


Reliability 

Three judges scored a group of 24 protocols 
pulled at random from the B sample. Inter- 
scorer reliability was computed for the four 
main indices: Jealousy, Guilt, Anxiety, and 
Obsessive Trends. (No reliability study seems 
necessary for Identification choice since quite 
objective criteria can be applied; Aggression 


TABLE 3
INCIDENCE OF GUILT INDEX ITEMS IN AUTOEROTIC (AE) AND
NONAUTOEROTIC (NON-AE) GROUPS

Columns: 1-A, AE (N = 12) and non-AE (N = 100); K-B, AE (N = 28)
and non-AE (N = 58); 1-B, AE (N = 32) and non-AE (N = 92);
2-B, AE (N = 16) and non-AE (N = 31)

Guilt Items
i 5 4 1 1 0 
2. Mother didn't know 4 10 11 8 23 i 0 
3. R slee; t 0 
3 Rather Resis Casper 4 A 7 10 0 o ô 
E ipiri 4 
6 tubo 3 2 4 9 0 0 0 
$ er or baby in water 2 2 1 s g 0 0 


7. Kissing


to Parents is used qualitatively and has not
been set up in terms of number of responses 
necessary for a “high” score.) The number 
of checks given to each item of each index 
was compared for each of the three pairs of 
judges. The Rulon formula, with the Spear- 
man-Brown correction, yielded the following 
reliability coefficients for each index: 


Jealousy: .95, .93, .95; average = .94 
Guilt: .78, .88, .82; average = .83
Anxiety: .92, .91, .92; average = .92 
Obsessive: .94, .87, .92; average = .91 
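
As a rough check on the figures above, the Python sketch below recomputes the four averages from the three reported coefficients and shows Rulon's difference-score formula on a pair of judges. The text does not describe how the Spearman-Brown correction was combined with the Rulon formula, so only the Rulon estimate is shown, and the two lists of judges' checks are invented for illustration.

```python
from statistics import mean, pvariance


def rulon_reliability(scores_a, scores_b):
    """Rulon's estimate: 1 - variance of differences / variance of sums."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sums = [a + b for a, b in zip(scores_a, scores_b)]
    return 1 - pvariance(diffs) / pvariance(sums)


# Coefficients reported in the text for the three pairs of judges.
reported = {
    "Jealousy": [.95, .93, .95],
    "Guilt": [.78, .88, .82],
    "Anxiety": [.92, .91, .92],
    "Obsessive": [.94, .87, .92],
}
averages = {name: round(mean(values), 2) for name, values in reported.items()}
print(averages)

# Hypothetical numbers of checks given by two judges to eight protocols.
judge_1 = [2, 0, 1, 3, 1, 0, 2, 1]
judge_2 = [2, 1, 1, 3, 0, 0, 2, 1]
print(round(rulon_reliability(judge_1, judge_2), 2))
```

The recomputed averages match the reported values of .94, .83, .92, and .91.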


While the overall reliability of scorers is 
quite satisfactory, the lower agreements on 
the Guilt Index were examined for possible 
causes. It was found that most of the dis- 
crepancies between judges occurred as the re- 
sult of neglecting to check, under the item 
“Father rejects Casper,” those responses in 
which the father is mentioned specifically as 
the punisher. The directions have subse- 
quently been clarified to call attention to this 
objective point. 


Discussion 


A replication of the film test has revealed 
no appreciable differences between the two 
large first grade samples, except in the area 
of deviant identification choices, and in the 
sibling status (but not the incidence) of 
children responding on the Jealousy Index. 
The repeat study has also confirmed the ear- 
lier impression that developmental progres- 
sions occur in some areas while plateaus are 
maintained in other dimensions. The fact that 
similar and congruent results were obtained 
between two samples differing in location and 
socioeconomic composition demonstrates a 
certain amount of construct validity for the 
test. To put it differently, if marked differ- 
ences and discrepancies had been found, then 
very little confidence could be put in this in- 
strument as a method of personality assess- 
ment. 

As was expected on the basis of original 
findings, the younger children express more 
outspoken aggression toward parents than do 
older children, and they also score higher on 
measures of guilt and anxiety. There appears 
to be no reason to abandon the earlier
hypothesis (Haworth, 1957) that this film does
pinpoint certain problem areas related to the
oedipal period. In view of the decided drop in
incidence of guilt and anxiety between the
ages of 7 and 8 (by which time the latency
period is presumed to be well underway), it
still seems, as originally suggested, that the
guilt measured by the film test is associated
with masturbation and other autoerotic acts.
(The postulated relationship between anxiety
and castration fears is currently being studied
via other projective techniques.) It can only
be speculated whether the slight trend
upward of the obsessive scores between the
second and fifth grades may indicate an
increasing incidence at still later ages. Fenichel
(1945) sees an increase in obsessive reactions
and compulsive rituals during the latency
period as defenses become strengthened against
the instinctual impulses. The curves in
Figures 2 and 3 may possibly be a graphic
representation of the repression of erotic drives
and the development of defense mechanisms.

If identification patterns are laid down
during the oedipal period, the incidence of
deviant identifications should remain at fairly
stable levels throughout the age range studied.
This was found to be the case. On the basis
of the film responses it would appear that
jealous reactions, once established, also do
not decline in the early latency period. The
threat to the ego is undoubtedly not as great
in this area as in those more closely linked
to the oedipal situation. Consequently there
would be less need to repress or defend. In
some instances jealousy toward siblings might
even be serving as a substitute outlet for
unacceptable feelings originally directed toward
the parents.

The one significant difference between the
two first grade samples, namely, deviant
identification, may possibly be attributable
to differences in the socioeconomic status of
the two groups. The item responsible for most
of this difference was the choice of the
opposite-sex parent by more children from the
higher status group. This finding is
consistent with that of [citation illegible in source], who showed
sex-role identification to be more clearly
defined, and at an earlier age, for lower class
children.

With respect to the Guilt Index, as has
been previously pointed out (Haworth, 1957),
it is not to be expected that all autoerotic
children would feel guilty. Nevertheless, the
fact that a repeat study still shows large
proportions of them giving a specific cluster of
responses suggests that dynamic factors are
being tapped by the index, namely, conflicts
between instinctual drives and conformity to
parental standards. The validity of the Guilt
Index for children from kindergarten through
second grade has also been demonstrated by
the consistently significant differences
between the number of high scores in the
autoerotic, as contrasted to the nonautoerotic,
groups.

In spite of consistent findings on the Jeal-
ousy Index with respect to the frequency of
children receiving high scores, the sex and
sib-status distribution of the scores was not
upheld in the second sample. It appears that
high scores may be measuring attitudes to
either older or younger siblings. In view of
the equivocal findings, caution should be exer-
cised in the interpretation of this index, espe-
cially if it is the only high score in a protocol.
In combination with high scores on other in-
dices, it may provide useful supplementary
data for diagnostic purposes.

SUMMARY AND CONCLUSIONS 


The projective puppet film, Rock-A-Bye,
Baby, was originally shown to 244 children
from nursery school through fifth grade. The
film has subsequently been shown to 257 chil-
dren from kindergarten through second grade.
The two large first grade samples showed
close correspondence with respect to incidence
of deviant scores on all measures except
Identification. The consistent developmental
progressions from grade to grade, within and
between samples, demonstrate construct va-
lidity for the instrument.

Two indices could be cross-validated by 
means of criterion groups within each sam- 
ple. The Guilt Index showed the predicted 
significant differences between autoerotic and 
nonautoerotic groups in all three grades of 
the new sample. Differences between sibling 
groupings were not upheld on the Jealousy 
Index. 

Since adequate interscorer reliability has 
been demonstrated for the instrument, and 
generally consistent kinds of data have been 
secured in a replicated study, confidence can 
be placed in this technique as a group screen- 
ing device in the personality assessment of 
early latency children. 
Standal and van der Veen (1957) have re-
cently suggested that number of interviews 
constitutes an important variable for study in 
research on psychotherapy. Their argument 
lay especially in the demonstration that of 
several measures of progress in therapy, based 
upon counselor judgments, a measure of 
change in personal integration of the client 
was not only the most important clinically 
and theoretically, but also showed the highest 
linear correlation with log number of inter- 
views. 

If it were substantiated that a very high 
correlation between length of therapy and 
change in personal integration exists, it would 
be an important finding, indeed; for it might 
be possible to employ number of interviews as 
a dependent variable of exceptional reliability 
which nevertheless has critical implications 
for personality change. There can be no doubt 
that the finding of dependent variables which 
have both high reliability and high validity 
and also relevant clinical implications is 
among the most important tasks of psycho- 
therapy research today. 

Therefore, it seems important to confirm 
this earlier finding and to attempt to deter- 
mine the most valid procedure, among pos- 
sible alternatives, for obtaining a measure of 
personal integration derived from counselor 
judgments. To illustrate the alternatives: we 
note that the counselors who made judgments 
of change in personal integration for the 
Standal and van der Veen (1957) study did 


1 The study was supported by funds from the Ford 
Foundation Psychotherapy Research Fund, granted 
to the Counseling Center, University of Chicago. 

2 D. S. Cartwright now at the University of Colo- 
rado; W. L. Kirtner now at the California Institute 
of Technology. 


so at the end of their series of contacts with
clients. In this procedure the counselors were
asked to rate the integration of each client at
termination and at the same time to scan their
long-term memories for a rating of the initial
level of integration. A nine-point scale rang-
ing from “highly disorganized or defensively
organized” (1) to “optimally integrated” (9)
was used and change in integration was de-
fined as the arithmetic difference between
initial and terminal scores.

If we consider the counselor’s thoughts in
making such ratings it seems reasonable to
suspect a certain bias resulting from the
lapse of time involved. The longer the period
of acquaintance experienced by the counselor,
the greater his tendency to underestimate the
initial level of integration. For supposing the
counselor were really just guessing about the
level of integration of his client a long time
ago, his thought might well go something like
this: “The client has been with me a
long time so he must have been in rather poor
shape to begin with.”

An alternative procedure would be to elimi-
nate this sort of possible bias by obtaining
judgments of the level of integration actually
perceived by the counselor at both beginning
and end points. Since counselor judgments are
not only the most frequently employed cri-
terion measures of progress, but also the [illegible]
available, the present study was undertaken
to compare the two rating procedures as
applied to data comparable to that of Standal
and van der Veen (1957). In addition it was
hoped to tease out some of the differences, if
any, between a measure of therapy length
based upon number of interviews and
one based solely on number of weeks.


Length of Therapy and Personal Integration 85 


SUBJECTS AND PROCEDURE


From a single large research block of cases at the
Counseling Center, University of Chicago, 87 clients
had terminated therapy at the time the present study
was undertaken. (Omitted were 6 cases which had
started in the block but were still in therapy.) All
clients had been seen during the period 1956-39 by
client centered therapists. These therapists included
both males and females, and their experience levels
ranged from 1 to 12 years. There were 52 male
clients and 35 females. Also, 52 of the clients were
students, 35 were not. The mean age of the clients
was 28.5 years (SD = 7.6), and they had been in
therapy for a mean number of 29.5 interviews (SD
= 28.1), and for a mean number of 31.9 weeks (SD
= 22.5).

Two separate measures of length of therapy were
taken: the exact number of interviews and the rounded
number of weeks. Counselors were asked to make
ratings on their clients immediately after the first
interview and also immediately after the last inter-
view. In the large majority of cases, ratings were
made between 1 and 3 days after the relevant in-
terview. The number of weeks was computed from
the number of days lapsing between initial and final
dates of ratings.

Two measures of integration movement were used.
The first measure was taken only at the end of
therapy, thus involving the counselor's long-term
memory. The counselor was asked to answer two
questions. The first was: “What change has there
been in the client’s feelings toward himself?” Four
response alternatives were provided, ranging from
“[illegible] discontented” through “much [illegible].”
Scores of 1 through 4, respectively, were
assigned. The second question was: “How much
change in the client as a person has occurred since
he started counseling?” Four response alternatives
were provided, ranging from “[illegible]” through
“a good deal.” Scores of 1 through 4, respectively,
were assigned. These two questions constituted the
first measure of integration change. It will be called
the posttherapy estimate of change in integration (PECI).

The second measure of integration change was a
difference score between two ratings, one made after
the initial interview and one made after the final inter-
view. Thus only the counselor's short-term memory
was involved. Ratings were made on a 10-point scale,
from “extreme maladjustment” through “optimal adjust-
ment (fully functioning, optimal maturity).” A score
of 1 was assigned to the most maladjusted point, a
score of 10 to the optimal adjustment end. On this
scale, the counselor was asked to give his
estimate of the client’s present psychological condi-
tion. The score for his estimate after the initial inter-
view was subtracted from the score for his esti-
mate after the final interview to yield the meas-
ure of change. This second measure will be
called the difference measure of change in integration
(DMCI).


In addition to the above measures, the counselor’s 
nine-point rating of success of the therapy was taken 
for this research. The scale has been used for many 
years at the Counseling Center, and was employed 
also by Standal and van der Veen (1957). The score 
of 9 means marked success. 

The reliability and validity of the measures PECI 
and DMCI are not known independently of the pres- 
ent study. However, it will be shown in the results 
below that both have strong correlations with the 
success rating and with each other. The success rat- 
ing scale has previously been shown to have sub- 
stantial reliability and validity (Cartwright, 1955). 


RESULTS 


The comparability between the samples 
studied by Standal and van der Veen (1957) 
and the present writers is very good. Notably, 
the mean length of therapy was 30.7 inter- 
views (SD = 32.5) in the former study, 29.5 
(SD = 28.1) in the present study. Both sam- 
ples have a slightly greater proportion of 
male than female clients, and of student than 
community clients. For both studies male and 
female therapists were employed. 

The data basic to replicating the major re- 
sults of Standal and van der Veen (1957, p. 
9) are included in Table 1, which shows in- 
tercorrelations of all the measures taken in 
the present study. 

First, both PECI and DMCI correlate posi-
tively and significantly (p < .001 and p <
.01, respectively) with log number of inter-
views. Thus, the first major conclusion of
Standal and van der Veen (1957), that 
“Change in level of personal integration . . . 
has a moderate linear relationship with log 
case length” (p. 9), is supported. 
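
The computation behind such a correlation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the case records below are hypothetical and are not data from the present study, and the names `pearson_r` and `difference_score` are introduced here for the sketch. It takes a difference score (final rating minus initial rating, as for DMCI) and correlates it with the base-10 logarithm of the number of interviews; the base of the logarithm used in the study is not stated, and it does not affect a Pearson correlation.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def difference_score(initial, final):
    """Change score: rating after the final interview minus rating after the first."""
    return final - initial


# Hypothetical cases (NOT study data): (number of interviews, initial rating, final rating)
cases = [(4, 4, 4), (12, 3, 5), (30, 5, 6), (55, 4, 7), (90, 3, 7)]

log_interviews = [math.log10(n) for n, _, _ in cases]
changes = [difference_score(initial, final) for _, initial, final in cases]

r = pearson_r(log_interviews, changes)
print(f"r between log interviews and change score: {r:.2f}")
```

Working with the logarithm rather than the raw count compresses the long right tail of case lengths, which is the reason a linear relationship is sought with log case length rather than with case length itself.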


TABLE 1

INTERCORRELATIONS OF TWO MEASURES OF LENGTH OF
THERAPY, TWO MEASURES OF CHANGE IN PERSONAL
INTEGRATION, AND A RATING OF SUCCESS
(N = 87)

                            Log Number   Log Number
                            of Weeks     of Interviews   PECI   DMCI
Log Number of Interviews      .85
PECI                          .22          .36
DMCI                          .10          .29            .64
Success                       .29          .49            .72    .68

Note. For r = .35, p < .001; r = .28, p < .01; r = .21, p < .05.
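
The critical values in the note to Table 1 can be related to the usual t test for a correlation coefficient. The following Python sketch is offered only as an illustration, on the assumption that significance was judged by the standard conversion t = r * sqrt((N - 2) / (1 - r^2)) with N - 2 degrees of freedom; the report does not say how the critical values were obtained, and the function name `r_to_t` is introduced here.

```python
import math


def r_to_t(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


N = 87  # sample size given in Table 1

# Critical r values quoted in the note to Table 1
for r in (0.35, 0.28, 0.21):
    print(f"r = {r:.2f}  ->  t({N - 2}) = {r_to_t(r, N):.2f}")
```

The resulting t values can then be compared with a tabled t distribution for 85 degrees of freedom.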
Second, the success rating correlates posi-
tively and significantly (p < .001) with log 
number of interviews. This finding accords 
with that of Standal and van der Veen (1957, 
p. 6), but the relative sizes of the correlations 
for success rating and for change in personal 
integration with log number of interviews 
differ in the two studies. Whereas Standal and 
van der Veen found the Pearson correlation 
for change in personal integration to be .58, 
and that for success rating to be .37, in the 
present study the order is reversed for both 
measures of change in personal integration as 
compared with success rating. Thus, the sec- 
ond major conclusion of Standal and van der 
Veen (1957), that “Change in level of per- 
sonal integration is more highly related to 
case length than change or outcome on other 
important case variables” (p. 9), is not sup- 
ported. This finding also lends no support to 
their fourth major conclusion, that “With re- 
spect to actual amount of therapy, change in 
personal integration may be more important 
than rated success or other case variables” 
(p. 9). At this time, one can say only that 
length of therapy is positively related to sev- 
eral measures of outcome or change. 

It should be noted that the above results 
hold for both a measure of change in personal 
integration which relies to some extent on the 
counselor’s long-term memory (PECI) and a 
measure of change which does not rely on 
long-term memory (DMCI). Examination of 
the correlations between log number of weeks 
and the three case variables in Table 1 shows 
that the two measures taken only at post- 
therapy (PECI and the success rating) have 
significant positive correlations, while the dif- 
ference measure which does not rely on long- 
term memory has a nonsignificant correlation. 
Since it makes little sense to partial out num- 
ber of weeks from number of interviews the 
evidence in Table 1 must be taken as it stands 
to suggest that sheer length of acquaintance 
does have some influence on the counselor’s 
ratings when these ratings involve his use of 
long-term memory. 

The question arises whether it is possible to 
show somewhat more conclusively the postu- 
lated effects of long-term memory on the 
counselor ratings of change made at the end 
of therapy. The first thing that may be noted 


from the reliability data presented by Standal
and van der Veen (1957, p. 5) is that the
rate-rerate reliability coefficient for personal
integration at the beginning of therapy, as
rated at the end of therapy, was .50 (not sig-
nificant); while the comparable coefficient for
the termination of therapy was .68 (signifi-
cant at the .05 level). It is also noteworthy
that in their discussion of reliability they re-
ported that 34 months later, certain coun-
selors could not remember well enough to
make reratings on certain items. The present
concern is whether the counselors at the time
of their first rating could remember enough
about the beginning of therapy to make valid
ratings. It was suggested above that with long
cases, the counselors might have been suffi-
ciently hazy in their long-term memory to be
rating essentially on a guessing basis with a
bias toward underestimating the level of inte-
gration shown by clients at the beginning of
therapy. To examine this issue, the original
data for the 72 clients reported on by Standal
and van der Veen (1957) in regard to ratings
of personal integration were re-examined along
with the ratings on DMCI for the present
sample. These authors report a Pearson cor-
relation of .67 between the success rating and
change on personal integration. Table 1 shows
the Pearson correlation of success rating with
DMCI to be .68.

The two scales are highly comparable. They
have closely similar wording. The first has nine
steps, the second has 10. Further, it was found
that the variances were not significantly dif-
ferent. Inspection of the distributions and
the wording for the bottom point suggested
that the scales could be considered essentially
equivalent if the unused bottom step of the
10-point scale was dropped and the other
steps renumbered accordingly.

Table 2 summarizes the comparisons be-
tween the ratings for the two studies when
the scale used in the present study is treated
as a nine-point scale.

Table 2 indicates that for the two samples
the difference between the posttherapy ratings
is not significant while the difference between
the pretherapy ratings is highly significant.
(Even if the latter difference is reduced by
.31, the amount of the posttherapy difference,
the t-value is still very high: 3.85.) Thus, for




TABLE 2

COMPARISON OF MEAN RATINGS OF PERSONAL
INTEGRATION FOR TWO STUDIES

Study                     Period Rated    N     M     SD     t
Standal & van der Veen    Beginning      72    3.1   1.4
                                                             [illegible]**
Present Study             Beginning      87    4.3   1.4
Standal & van der Veen    End            72    5.4   1.7
                                                             1.22*
Present Study             End            87    5.7   1.4

* p < .20
** p < .001


comparable samples, there is no difference for
the posttherapy ratings which were based on
short-term memory. For the ratings of inte-
gration at the beginning of therapy, however,
the mean rating of integration is significantly
lower for the sample studied by Standal and 
van der Veen (1957). This result is in accord 
with the expectation that counselors relying 
on long-term memory would tend to under- 
estimate the degree of personal integration 
shown by their clients at the beginning of 
therapy. 
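
The comparisons in Table 2 can be approximated from the summary statistics alone. The Python sketch below is a hedged illustration, not the authors' computation: it assumes an independent-groups t test with pooled variance, which the report does not specify, and it assumes that the means and standard deviations read 3.1, 4.3, 5.4, and 5.7 and 1.4, 1.4, 1.7, and 1.4, with decimal points restored from a damaged copy of the table. The function name `pooled_t` is introduced here.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-groups t from summary statistics, using pooled variance.

    Returns (t, degrees of freedom); t is positive when the second mean is larger.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (m2 - m1) / se, df


# Summary values as read from Table 2 (decimal placement is an assumption)
t_begin, df = pooled_t(3.1, 1.4, 72, 4.3, 1.4, 87)  # beginning-of-therapy ratings
t_end, _ = pooled_t(5.4, 1.7, 72, 5.7, 1.4, 87)     # end-of-therapy ratings

print(f"beginning: t({df}) = {t_begin:.2f}")
print(f"end:       t({df}) = {t_end:.2f}")
```

Under these assumptions the end-of-therapy comparison comes out close to the tabled value of 1.22, while the beginning-of-therapy comparison is several times larger, which is the pattern the text describes. Because the inputs are rounded, small discrepancies from the published figures are to be expected.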


DISCUSSION


The absolute size of the correlations ob-
tained in this study between log number of
interviews and measures of change in per-
sonal integration was not very great, even
though the latter judgments may be influ-
enced by knowledge of the former. While it
does not seem likely that number of inter-
views can be used as a clinically meaningful
dependent variable on its own, it does bear
useful relations to a number of important
measures of change taken from counselor
judgments, and these relations appear to be
quite stable over the two samples studied. It
is clear from the present findings, however,
that considerable caution must be exercised
when employing counselor judgments to ob-
tain such estimates or estimates of any vari-
able. In particular it appears important to
pay careful attention to the conditions under
which counselor judgments are obtained, es-
pecially in regard to the time span over which
they are called upon to exercise their memo-
ries.

The question of whether log number of 
weeks or log number of interviews is the better 
measure of length of therapy cannot be given 
a general answer from the present study. So 
far as the evidence does go, it appears that log 
number of interviews shows the higher cor- 
relations with measures of change in personal 
integration and success of therapy when these 
are taken from counselor judgments. How- 
ever, it also seems that a spurious length-of- 
acquaintance factor may be contributing to 
those higher correlations when the measures 
are taken from counselor judgments made 
only at the termination of therapy. All in all, 
the best procedure at the present time would 
seem to be offered by the use of log number 
of interviews in conjunction with judgments 
made both at the beginning and at the termi- 
nation of therapy. 


SUMMARY 


Data for 87 clients seen by client centered 
counselors were examined in order to replicate 
certain analyses made by Standal and van 
der Veen (1957) on a similar sample. It was 
confirmed that counselor rating of movement 
on personal integration bears a linear rela- 
tionship to log number of interviews. In con- 
trast to the earlier results, the present study 
found that the counselor success rating had 
a higher correlation with length of therapy 
than did rated movement on personal inte- 
gration. An alternative measure of length of 
therapy was also employed in the present 
study, namely log number of weeks. Correla-
tions with movement and success were uni-
formly smaller for this alternative measure of
length.

Memory factors influencing the counselors’
judgments were examined by use of two meas-
ures of change in personal integration, one 
calling upon the counselor to rate change at 
the end of therapy only, the other calling 
upon him to rate the level of integration he 
sees in the client after the first interview and 
after the final interview, change being calcu- 
lated from the difference between the initial 
and final ratings. The results showed that, 
when counselors’ ratings of change are made 
only after termination of therapy, they are
influenced by the sheer length of acquaint- 
ance with the client. It was also hypothesized 
that in the Standal and van der Veen (1957) 
study, counselors who, after the termination 
of therapy, rated the initial level of personal 
integration of the client would have been op- 
erating on such long-term memory as to in- 
volve considerable guesswork coupled with a 
bias to underestimate the client’s initial level 
of integration. This hypothesis was tested by 
comparing the mean rating of initial integra- 
tion in the earlier study with the mean rating 
of initial integration in the present study 
when counselors were rating from short-term 
memory. The result supported the hypothesis. 


It was concluded that the use of log num- 
ber of interviews together with judgments 
made both at the beginning of therapy and 
at the termination appears to be the best 
present procedure for examining the relations 
between length of therapy and case variables 
obtained from counselor judgments. 
BENDER-GESTALT FIGURE ROTATIONS:
A STIMULUS FACTOR 


RICHARD M. GRIFFITH AND VIVIAN H. TAYLOR


Veterans Administration Hospital, Lexington, Kentucky 


Hannah (1958) modified the Bender Visual
Motor Gestalt Test (BG) by rotating the de-
signs through 90 degrees on their rectangular
cards, presenting the design to the subject as
before. With this new set of cards patients
produced fewer rotations of figures, presum-
ably because the longer axis of the card now
corresponded to the longer axis of the paper.
However, his statistically significant differ-
ence was due to a few of his controls pro-
ducing multiple rotations; examining his sta-
tistics it becomes evident that just as many
patients in one group as in the other rotated
at least one figure (8 out of 36 in each case).

The present study is essentially a replica-
tion of his; however, instead of redesigning
the cards, a comparable effect was attained
through the expedient of rotating the tablet,
card and paper being thus oriented length-
wise, left-to-right instead of up-and-down as
in his.

Examiners within a large neuropsychiatric
hospital were asked to rotate the tablet when
administering the BG. Habits being hard to
break, not all did so. Those who did not there-
by unwittingly collected a “control” group of
records, which, as it turned out, matched the
experimental as to diagnosis. As the psycho-
logical reports crossed the secretary’s desk ro-
tations were noted, the study continuing over
a [illegible]-month period. An angular displacement
of at least 45 degrees in a recognizable figure
was the criterion for rotation.

[illegible] “tablet-turned” records were ob-
tained and [illegible] conventional ones. Under the
modified conditions 12.5% of the records had
one or more figure rotations vs. 29.3% for
the conventional, the two proportions differ-
ing significantly at the .02 level (one-tailed


e neuropsychiatric 
e the tablet when 


ds were ob- 
Under the 


test). A chi square between the distribution 
of diagnoses in the two groups (five major 
diagnostic categories being considered) was 
small—0.650 for 4 degrees of freedom—per- 
mitting the conclusion that the groups were 
well matched according to diagnosis. 
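
A comparison of two proportions of this kind can be sketched as follows. This Python fragment is an illustration only: the counts are hypothetical, chosen merely so that the proportions fall near the reported 12.5% and 29.3%, since the actual group sizes cannot be read in this copy; and the pooled two-proportion z test shown here is an assumption, as the report does not name the test that was used. The function name `two_proportion_z` is introduced for the sketch.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z test.

    x1 of n1 records in group 1 and x2 of n2 records in group 2 show the
    feature of interest. Returns (z, one-tailed p for group 2 > group 1).
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p2 - p1) / se
    p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal area
    return z, p_one_tailed


# HYPOTHETICAL counts (not the study's): 7 of 56 tablet-turned records and
# 27 of 92 conventional records with one or more rotations.
z, p = two_proportion_z(7, 56, 27, 92)
print(f"z = {z:.2f}, one-tailed p = {p:.4f}")
```

A one-tailed test is appropriate here only because the direction of the difference (fewer rotations with the tablet turned) was predicted in advance from the earlier study.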

The 29.3% rotations in the standard rec-
ords were unaccountably higher than the 22.8%
previously determined from approximately
1,000 records in the files of the same hos-
pital (Griffith & Taylor, 1960). After a chi
square test had shown that there had been
no statistically significant shift in diagnoses,
all the data collected under standard condi-
tions were combined for a total number of
1,152 tests, 23.5% with one or more figure
rotations. This 23.5% differed just at the .05
level of statistical significance from the 12.5%
of records with rotations in the unconven-
tional, tablet-turned group (one-tailed test).

Hannah’s results would seem to be con-
firmed. It may be concluded that many ro-
tations are caused by the patient orienting
the design to the major axis of the paper in 
the same relation it bears to the major axis 
of the card, even though to do so involves 
actually turning the design in relation to him- 
self. The results fit into the pattern of in- 
vestigations begun by Shapiro (see Williams, 
Lubin, Gieseking, & Rubinstein, 1956) which 
relate the phenomena of rotations of both 
block designs and BG figures to stimulus 
properties of figure and ground. However, it 
should be pointed out that however success- 
ful we may be in pinpointing the stimulus 
variables which influence rotations, rotations 
do not thereby lose their diagnostic signifi- 
cance; as long as different diagnostic groups 
are influenced differently by the stimulus con- 
ditions as they seem to be (Griffith & Taylor,
1960) the rotation will still have diagnostic 
significance. 

To sum up, it was confirmed, through a 
replication of a previous study, that many of 
the rotations of the Bender-Gestalt figures 
may be attributed to the accidental circum- 
stance that the long axis of the test card is 
oriented at 90 degrees to the long axis of the 
paper upon which the figure is usually drawn. 


BRIEF REPORTS

THE SOCIAL DESIRABILITY SET IN INDIVIDUAL AND
GROUPED SELF-RATINGS*


NORMAN A. MILGRAM AND MALCOLM M. HELPER


Nebraska Psychiatric Institute


High correlations have typically been found
between a group’s self-ratings on an array of
items and the social desirability of those items.
Taylor (1959) has challenged the conclusion that
an S’s self-ratings reflect only his desire to
produce a favorable self-picture.

The present study replicates Taylor’s design in
comparing individual and grouped data, but uses
[illegible] Ss, a shorter time interval between rat-
ings, different instructions, and a different rating
instrument, one whose test-retest reliability for
self-ratings had been studied.2 In addition, it at-
tempts to manipulate the desirability set in indi-
vidual Ss by exposing them to a personally relevant
[illegible] prior to obtaining their self-ratings; presum-
ably this exposure would enhance the desirability
set.

Eighty incoming freshman male medical stu-
dents ranked the definitions of the 15 Murray
needs given by Edwards (1957) from most to
least characteristic of themselves, and in a sepa-
rate ranking, from most to least characteristic of
successful physicians. Forty Ss ranked the items
first for themselves and immediately thereafter
for physician (Group S-P), while the other 40
followed the reverse sequence (Group P-S).

In Group S-P, where the rating sequence was
comparable to Taylor’s, the pattern of results
was similar to his. Rank-order correlation be-
tween average ranks assigned to the items for
self and for physician was .89, while the median
of the individual correlations was only .63, with

1 An extended report of this [illegible]

2 The scale was administered to 14 nursing stu-
dents on two occasions one week apart; the
[illegible] correlation was .86.


11 of 40 Ss having correlations below .44, the 
.05 significance level. These results support Tay- 
lor’s contention that for the self-ratings of a sub- 
stantial portion of Ss, factors other than social 
desirability set are operative. 
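
The individual correlations described here are rank-order correlations between two rankings of the same 15 items. A minimal Python sketch of such a coefficient is given below, assuming Spearman's rho computed from rank differences with no tied ranks; the report says only "rank-order correlation," so the exact formula is an assumption. The two rankings are hypothetical and are not data from the study, and the function name `spearman_rho` is introduced for the sketch.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order correlation for two untied rankings of the same items.

    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference
    between the two ranks given to each item.
    """
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * sum_d2 / (n * (n * n - 1))


# HYPOTHETICAL rankings of 15 items by one subject (not study data):
# first for self, then for the successful physician.
self_ranks = list(range(1, 16))
physician_ranks = [2, 1, 3, 5, 4, 6, 8, 7, 9, 10, 12, 11, 13, 15, 14]

rho = spearman_rho(self_ranks, physician_ranks)
print(f"rho = {rho:.2f}")
```

Computing one such coefficient per subject and taking the median gives the individual-level figure, whereas ranking the item means across subjects and correlating those gives the grouped figure; the contrast between the two is the point of the comparison.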

In the group receiving reversed-order instruc-
tions (Group P-S), however, the median indi-
vidual correlation (.85) was significantly higher
than that in Group S-P (p < .01, median test)
and close enough to the correlation based on item
means (.95) to suggest that little but the desir-
ability set was operating in these Ss; only two
individual correlations in this group fell below
.44. Apparently making physician ratings first en-
hanced the desirability set in the subsequent rat-
ings of self.

That occupying the second position in the in- 
struction sequence modified the self-ratings in 
Group P-S, and not the physician ratings in 
Group S-P, is indicated by an additional finding: 
self-ratings in Group P-S were more uniform 
than in Group S-P, while there was no difference 
in physician ratings. When each S’s self-rating
was correlated with the mean self-rating for his
group, the median rho for Group P-S was .80
and for Group S-P .62, the median test being
significant at the .01 level. Physician ratings
were higher and equally uniform for P-S and
S-P groups, the median rho’s being .87 and .85,
respectively.

In addition to corroborating Taylor’s findings, 
the present study provides evidence that the de- 
sirability set in self-ratings can be enhanced by 
simply having Ss make desirability ratings first. 
CHOICE DISCRIMINATION IN SCHIZOPHRENIC AND NORMAL
SUBJECTS FOR POSITIVE, NEGATIVE, AND NEUTRAL 


AFFECTIVE STIMULI* 


MILTON TURBINER 


Veterans Administration Hospital, Northport, New York 


This study was concerned with the effects of 
three varying affective stimuli on discriminative 
performance of schizophrenic and normal sub- 
jects (Ss). It was hypothesized that the apparent 
ineffectiveness of schizophrenic Ss in discrimina- 
tion tasks is related to certain motivational fac- 
tors within the stimulus situation. 

Twenty male schizophrenic Ss and an equal 
number of normal Ss were used in this study. 
Three series of pictorial stimuli were selected 
corresponding to the dimensions of positive, nega- 
tive, and neutral affective states. Each scene was 
represented by five pictures. Two of the series 
consisted of a social situation involving a female 
figure whose face and hands were clearly in evi- 
dence and a three-fourths rear profile view of a 
young child in the foreground. The third series 
consisted of a geometric design with the intent 
that these pictures were to represent a minimal 
amount of any given affective quality. The series 
categorized as negative affective contains the 
theme of reprimand by the central female figure 
with respect to the child; the positive affective 
series contains the theme of acceptance and de- 
sire for closeness on the part of the woman with 


1 This paper is derived from a doctoral disserta-
tion submitted in partial fulfillment of the require-
ments for the degree of PhD, Boston University
Graduate School, 1955. Grateful acknowledgment is
due L. J. Reyna and J. V. Gilmore for their help
and guidance.

An extended report of this study may be obtained
without charge from Milton Turbiner (Box 326,
Veterans Administration Hospital, Northport, New
York) or for a fee from the American Documenta-
tion Institute. Order Document No. 6411 from ADI
Auxiliary Publications Project, Photoduplication
Service, Library of Congress, Washington 25, D. C.,
remitting in advance $1.75 for microfilm or $2.50
for photocopies. Make checks payable to: Chief,
Photoduplication Service, Library of Congress.


respect to the child. Size, position, and general
physical characteristics of the characters were
held constant in both affective series, except for
a progressive alteration of the facial expression
of the central figure and the change in the posi-
tion of her hands, from those representing
closeness to those representing rebuff.

The instructions required that upon simultane-
ous presentation of the pairs of stimuli of each
series the Ss were to indicate whether the attitude
expressed in the central figure in both pictures
was the same or different. Similar instructions
were given for the discrimination of the geo-
metric series. The scores obtained for each S
consisted of the frequency with which he re-
sponded “same” to a pair of different pictures
and “different” to a pair of identical scenes.

Coincidental with the writing of the disserta-
tion from which this brief report is derived,
Dunn (1954) published a study based upon a
similar hypothesis and research design. The find-
ings of this study are generally consistent with
those reported by Dunn (1954). It was found
that the performance of the schizophrenic Ss
was less effective in contrast to that of the nor-
mals with respect to negative as well as posi-
tive affective stimuli discrimination. However,
their performance was indistinguishable from that
manifested by the normal group with respect to
the neutral stimuli. This is a clear indication of
a capacity common to both normal and schizo-
phrenic groups, which, under predicted conditions,
was not utilized effectively by the schizophrenic
group as stimuli conditions changed.
CLINICAL PERCEPTION OF THE THERAPEUTIC
TRANSACTION 


BERTRAM R. FORER, NORMAN L. FARBEROW, HERMAN FEIFEL, 


MORTIMER M. MEYER, 


Veteri 


An important facet of the diagnostic and
[...] work of the clinician is that the
data upon which he makes his decisions be
relevant and determinate. In all clinical
fields there are data of such universality as to
be of essentially no differential significance
(F[...], 1959). Diagnosis in internal medicine
is not furthered by such facts as the existence
of a given organ. Similarly clinical psychologists
are minimally assisted in their decisions
by knowledge that their patients are human
beings, have problems, possess unfulfilled
[...], and the like. Clinical practice presupposes
recognition and evaluation of characteristics
that vary sufficiently among clients to
permit the description of uniqueness.
But this is not enough. It is of additional
importance that characteristics in which
[...] do vary be relatively stable for a given
individual. If, for one client, a particular
psychological characteristic were to manifest a
temporal variation approaching that of a
[...] sample of individuals at any one time,
clinical description would be largely a matter
[...]. The same would be true of the
individual described by a group of observers
whose ratings covered the total population
[...] on a given trait.
In an attempt to clarify some of the [...]
of clinical observation and description
and to experiment with suitable research
methods for further work, the writers sought
to determine in concrete clinical situations
how well a group of trained observers [...]
agree on what they perceived, what kind of
clinical material could be perceived most
reliably, and whether clarification of descriptive
language and concepts would enhance agreement
among judges.
1 Formerly part of Veterans Administration Regional
Office, Los Angeles.
METHOD


To this end a check-list was developed to include 
a sample of items which might be manifested in a 
therapeutic hour. Items were selected which would 
likely characterize some, but not all, clinical inter- 
actions. All items were cast in a form that permitted 
a rating of “present” or “not present.” The number 
of items changed as the experiment progressed as 
indicated in Table 1. Items were classified a priori 
by the judges as either “observational” or “infer- 
ential” according to the degree of extrapolation 
beyond immediate data believed necessary to make 
the clinical judgment. 

Judges were six diplomates in clinical psychology, 
five of whom had worked together in the same clinic 
for 8 to 11 years, and one who had been with the 
group for about 3 years and whose clinical training 
and background were similar in content and dura- 
tion. All were joint participants in training, super- 
vision, diagnosis, therapy, seminars, and research in 
a psychoanalytically oriented Veterans Administra- 
tion Mental Hygiene Service. 

The judges observed 50-minute psychotherapeutic 
sessions between a patient and a therapist through 
a one-way screen with a microphone-amplifier
hookup. Judges made independent ratings without 
discussion. Observations consisted of four phases: 


A1-3. A male psychology trainee with a neurotic
woman patient. They had been working in that
observation room for some time, but did not know
of the group’s observation. There were three weekly
sessions followed by a discussion and revision of
some items.

B1-3. A female psychiatrist with a schizophrenic
male patient. They had been working together for
some time in another room and the therapist moved
into the observation room at the experimenter’s
request. There were three sessions followed by
discussion and revision of items.

B4-6. The same patient-therapist team. There were
three more weekly sessions followed by discussion
and more extensive revision and redefinition of
items (Table 1).

C1-3. A male psychology trainee with a neurotic
male patient whom he had been seeing in the room
for some time for supervisory purposes. They knew
nothing of the experiment.
TABLE 1


CHECK LIST FOR CLINICAL OBSERVATIONS


1. Therapist was primarily active. (Active: verbal acti[...])
2. Therapist was primarily comfortable. (Comfortable: has sense of ease with the patient. Frame of reference
   should be our concept of comfort for all therapists.)

Therapist’s major method was:
3. Reflective
4. Interpretive
5. Supportive

6. The session seemed to be focused on a problem.
   (6a. Therapist attempted to focus on a particular problem, either content or dynamic.)
   (6b. Patient tended to ramble from topic to topic.)
7. There were silences. (Silence: period when no one is talking, 30 seconds or more)
8. If there were silences, they were usually broken by:
   a. the patient
   b. the therapist
9. The hour was characterized by resistance.
   (9a. Patient expressed verbal disagreement with therapist’s interpretations, regardless of unconscious factors.)
   (9b. Patient interrupted therapist more than once.)
   (9c. Patient was halting in his speech: made pauses before words, incomplete sentences, more than [...])
   (9d. Patient spoke mostly in generalities.)
   (9e. Patient used technical psychological terms, more than once.)
10. Patient spoke in a monotone.
11. Patient was fidgety or restless.

Problem areas worked on were:
12. Authority
13. Sex
14. Dependency
15. Work
16. Hostility
17. Emotional control
18. Symptoms
19. Relationships with people

20. Patient used gestures. (More than once)
21. Patient brought up a lot of material. (Material: variety or elaboration of content)
22. The material brought up was deep. (Deep: (a) patient meaningfully relates something that is happening
    at present to what has happened in past [content], or (b) produces something with great affect)
23. Patient was experiencing affect.

If experiencing affect, it was:
24. Anger
25. Fear
26. Sadness
27. Anxiety
28. Warmth

29. Patient seems to be rigid. (This applies to fluidity and spontaneity during the hour, whether
    censored or uncensored, etc. Not a judgment of character structure.)
30. Patient seems to show ability to form close relationships.
31. Patient seems capable of insight.
32. Patient seems self-critical.
33. (Patient’s relationship to the therapist during the hour seemed to be one of positive feelings.
    This refers to conscious feelings.)
34a. The nature of the transference is predominantly positive.
34b. The nature of the transference is predominantly negative.

Patient’s major defenses shown were:
35. Projection (Attribution to other person of motives unacceptable to oneself)
36. Repression (Rationalization: logical excuse or justification of feelings or behavior)
37. Avoidance (Denial: avowed nonperception of reality situation, internal or external)
38. Intellectualization (Explanation of one’s feelings or behavior in terms of general or theoretical
    principles or abstract concepts)
39. Isolation (Separation of feeling and idea)
40. Reaction formation (Turning into the opposite, internal)
41. Conversion (Displacement: shift of feeling from one object or person to another)

42. Patient will complete therapy.
43. Patient will need long-term therapy, 1½ years or more.
44. Diagnostic impression

a Items in parentheses are clarifications or substitute items created during the revision period before
the three observations of the last patient.
b Items 19 and 32 added after first period of observation.


Each of the four sections, then, consisted of three
[...] observations of the same patient and
therapist. After each triad of observations the data
were examined and definitions were clarified with
the goal of establishing a clearer basis for judgment.
Between B4-6 and C1-3 most items were defined as a
result of consultation with three psychoanalysts.
The definitions were mimeographed on the rating
sheets and unclear items were replaced
by more [...] observational items.


The degree of interrater agreement was described
in terms of the binomial expansion in which
p is .5 for each item (present or absent), and n
is the number of raters, generally six, but occasionally
five. There is reason to question the .5 probability
value for many clinical data since such data
are apt to occur in a clinical sample with varying
frequencies and clinicians are likely to have
differ[...] most reasonable in the [...]
of other information.

Agreement, then, was expressed at first in terms of p
values obtained from the binomial expansion.
We wished to know whether our judges agreed more
on the observational than on the inferential items.
Hence compounding of probabilities was necessary.
To compound the probabilities over several series,
p values were converted into chi square according to
Pearson’s transformation, $\chi^2 = -2 \log_e p$, as
described by Jones and Fiske (1953).
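
The binomial p values and Pearson's transformation described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the p value for a replication is assumed to be the upper-tail binomial probability that the observed number of raters, or more, give the same judgment when p = .5.

```python
from math import comb, log


def agreement_p(n_raters: int, n_majority: int, p: float = 0.5) -> float:
    """Upper-tail binomial probability that n_majority or more of
    n_raters give the same judgment (assumed reading of the text)."""
    return sum(
        comb(n_raters, k) * p**k * (1 - p) ** (n_raters - k)
        for k in range(n_majority, n_raters + 1)
    )


def pearson_chi_square(p_value: float) -> float:
    """Pearson's transformation: chi square = -2 log_e p, with 2 df."""
    return -2.0 * log(p_value)


# Unanimity among six raters and among five raters.
p_six = agreement_p(6, 6)
p_five = agreement_p(5, 5)
print(round(p_six, 3), round(p_five, 3))
print(round(pearson_chi_square(p_six), 2))
```

Under these assumptions unanimity gives 1/64 for six raters and 1/32 for five, which agrees with the .016 and .031 reported in the Results.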


[...] and q was made by an empirical investigation
[...] plotting the proportion of present responses
for each item over the 12 replications and setting up
a distribution. The median value of .48 suggests that
for this sample of items the assumption that p = .5
is a reasonable one.

The issue of independence of the tests is of importance
here. Most of the items had been planned
[...] independent as possible. Measurement
of independence in this study [is] impossible because
any item about which there is complete agreement
will necessarily correlate perfectly, [...]
naively, with every other unanimous item.
Whether replications on the same patient can be
considered independent events is still an open ques-

Each chi square that enters into the compounding 
carries 2 df. Hence, the compound probability for a 
given item for three replications is the sum of the 
three chi square values and df=2x3=6. In 
similar fashion the compound probabilities of the 
amount of interrater agreement on the observational 
and inferential items were computed by totaling the 
chi square values for all observational and inferential 
items separately with df equal to twice the number 


of items. 
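
The compounding step can be illustrated with a short sketch. It assumes nothing beyond the description above: each p value is converted to a chi square with 2 df, the chi squares are summed, and the sum is referred to a chi square distribution with df equal to twice the number of p values. The function names are hypothetical, and the closed-form tail probability used here holds only for even df.

```python
from math import exp, log


def chi_square_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi square value for even df,
    using the closed form exp(-x/2) * sum((x/2)**i / i!)."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return exp(-half) * total


def compound_p(p_values):
    """Sum the -2 log_e p chi squares (2 df each) and return the
    summed chi square, its df, and the compound probability."""
    chi_sq = sum(-2.0 * log(p) for p in p_values)
    df = 2 * len(p_values)
    return chi_sq, df, chi_square_sf_even_df(chi_sq, df)


# One item rated unanimously by six raters in three replications.
chi_sq, df, p = compound_p([1 / 64, 1 / 64, 1 / 64])
print(round(chi_sq, 2), df, p < 0.001)
```

With three unanimous replications the summed chi square carries 6 df, as in the text, and the compound probability falls below .001.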
The important questions were: do the judges agree 


more or less in their ratings of observational and 
inferential items, and do judges agree more in suc- 
cessive periods of observation as a function of 
experience and redefinition of terms? Statistically, 
these questions reduce to the significance of the dif- 
ference between total amounts of agreement. Since 
our measures of agreement are expressed in terms 
of chi square, the statistical test is of the difference 
between two chi squares. To our knowledge, the 
only possible way of testing this difference is by 
means of the F ratio. The F ratio is defined (Peters
& Van Voorhis, 1940, p. 420) as the ratio of two
independent chi squares, each divided by its own df:

$$F = \frac{\chi^2_1 / df_1}{\chi^2_2 / df_2}$$

In this case $\chi^2_1$ is the summation of the chi squares
representing the amounts of agreement on all items
in one treatment (e.g., observational items) and
$df_1$ is the degree of freedom (e.g., twice the number
of observational items). $\chi^2_2$ and $df_2$ are the
corresponding values for the other treatment (e.g.,
inferential items).
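
As a worked illustration of this ratio, the sketch below uses the totals printed in the last row of Table 3 (observational: summed chi square 1,025.26 with 416 df; inferential: 1,368.72 with 576 df). The function name is hypothetical, and the sketch only forms the ratio; judging its significance would still require an F table or an F distribution routine at the corresponding df.

```python
def f_ratio(chi_sq_1: float, df_1: int, chi_sq_2: float, df_2: int) -> float:
    """Ratio of two summed chi squares, each divided by its own df."""
    if df_1 <= 0 or df_2 <= 0:
        raise ValueError("degrees of freedom must be positive")
    return (chi_sq_1 / df_1) / (chi_sq_2 / df_2)


# Totals over all 12 replications, as printed in Table 3.
f_total = f_ratio(1025.26, 416, 1368.72, 576)
print(round(f_total, 3))
```

The ratio comes out close to 1, which is in line with the report that no comparison of observational with inferential items reached the .05 level.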
RESULTS 


To attain a statistically significant degree
of rater agreement in a given replication of a
single item all raters must give the same
judgment (for the values of n in this study).
Unanimity yields a p value of .016 for six


tion. The nature of our findings, however, suggests
that they were nearly independent.
TABLE 2


AMOUNT OF AGREEMENT AMONG JUDGES FOR EACH ITEM, PATIENT,
AND OBSERVATIONAL PERIOD


Patient A Patient B Patient C 
Item i 2 3 ES 123 £ 4 5 6 #6) 1 @ 3 isl Toal 
Observational 

1 Al ät .02 02 01 02 02 01 a 

6 02 .02 01 = ae 
oe > 02 10 

ób -» 02 02 02° 001 

i 02 10 .02 02 .03 .001 .02 Al 41 10 Oo 
9a -> 02 02 01 

9b + ain 10 

9c w O 11 11 01 

9d + G 02 02 .001 

9e —> 02 .02 .11 01 

10 02 03 02 Oi. 11 dl At at 10 11 02 02 01 001 
11 03 10 .02 .02 .03 001 .02 .03 .02 .001 001 
12 02.03 01 A103 d0: 02. 11 102 01 001 
13 02 02 01 .02 .02 01 03 02 01 02 02 .02 .001 001 
14 03 .11 05 02 02 .03 001 02.11 05 001 
15 02 .03 .02 .001 ji m 40) %2 ii o5 001 
16 ai 10 A ; 
17 02 11 .03 01 .02 02 Ol 11 11 02 01 oot 
18 11 02 .02 .02 .02 .03 .001 .02 .03 .02 .001 02 .11 .05 001 
19 æ 03 02 .01 .02 ‘05 02 .03 .02 .001 .02 .02 .02 .001 001 
ay at ai 02 11 08 
21 11 02 11.03 o5 02 .02 11 01 oot 
32 > 03 41 a ii 03.05 02 10 ü 11 410 Di 

Inferential 

3 .02 .02 o1 02 .03 .02 .001 02 001 
3-5 02 03 02 .001 .02 02 03 .001 .02 03 .11 .01 11 ‘Oot 
9 1 03 11 02 nel a 11 10 = 

22 03 .11 05 02 10 Al 10 05 
23 02.03 02 001 11 02 05 .03 02 .02 02 001 001 
24 03 11 05 41 Al 02 02 11 02 02 01 ‘904 
A At 02 02 11 ‘11.03 05 1 Si 
26 02 02 01 02 05 Al 02 11 05 001 
a 02 02 01 02 5 11 it w Gor 
8 02.05 02 02 .03 .001 03 AO 02 .02 02 001 001 
29 AL 03.02 01 11 AL 02 02 01 ‘oot 
30 At 02 03 01 .02 03 02 001 .11 ‘001 
zi 02.03 02 001 11 ‘Al AL 02.02 02 001 ‘001 
34a e a .02 a 

> g 03 11 o o2 o o1 02 .03 .02 001 J At 40 o 
36 ; 01 02 16. 1t 03 .05 kii o1 
37 gi AL o AL 11 iò 

38 03 02 02 01 02 os di ‘AL 

: AL 03 10 Al 02 11 wo a 

39 11 AT 240) 02 v u g a Bt 
40-8203 02 001 02 02 o3 goi 02 03 02 001 D 02 o ‘OL Ol 
41 02 02 01 1.03.05 ae 

42 02 .03 .02 001 ee iant wi 
43 02 03 02 .001 .02 02 03 .001 02 .03 02 001 02 62 ‘02 001 001 


Note. All entries are expressed in probabilities.
a Item deleted at this point.
b Item added after this point.
c Item replaced at this point.
raters and .031 for five raters. In only one
of the 12 replications did more than half of 
the observational items attain this degree of 
rater agreement. The same is true of the in- 
ferential items. The proportion of observa- 
tional items that showed significant agree- 
ment varied from 31.3% to 54.5% among 
the 12 observational periods. For the infer- 
ential items the range is from 28.0% to 
54.2% (Table 2). It is patent that significant 
agreement was not the rule. On two items 
only was agreement unanimous throughout 
the 12 replications: absence of reaction 
formation (Item 40), and need for long-term 
therapy (Item 43).
Most, but not all, of the items were sig- 
nificantly in agreement when the probabilities 
were compounded over the 12 replications. 
Even so, most items varied enormously in 
degree of agreement from replication to repli- 
cation; hence overall significance of agree- 
ment gives little ground for confidence at any 
one time or for any one case in clinical 
observation. Description of the vicissitudes of
a few items may be informative. The presence
of monotonous speech (Item 10) was significant
in only 5 of the 12 replications, restlessness
(Item 11) in 7 replications. Some content
items were rarely significant. The problem
area, hostility, a clinical favorite, was [...]
once agreed upon unanimously. Presence of
gestures (Item 20) was significantly agreed
upon once in the series.

Among the inferential items depth of material
(Item 22) was significant twice, before
the clarifying definition. Presence of anxiety
showed perfect agreement three times, early
in the series; capacity for insight showed perfect
agreement consistently for the first patient’s
three replications, and for the last
patient’s as well, and not at all for the second
patient’s six replications. Judgments about
positive transference were never in complete
accord; negative transference ratings were in
accord for seven replications (all by absence).
Agreements in regard to ego defenses
were as follows: reaction formation: perfect
score, always absent; projection: four significant
agreements split between two patients;
denial, intellectualization, and isolation:
each three times; rationalization: never
significantly agreed upon.

While psychological defenses, it may be 
argued, are rather subtle and may become 
apparent only in intensive, therapeutically 
oriented observation, the lack of agreement 
in six successive observations periods with the 
same patient seems cause for some concern. 
The judges agreed only once in nine replica- 
tions on the item: The hour was characterized 
by resistance. This item was replaced for pa- 
tient C’s observations by Items 9a through 9e 
which were deemed to represent some of the 
observational components of resistance. Dur- 
ing the three observations of the last patient 
9c was significant all three times, 9a and 9d 
twice, 9c once, and 9b not at all. If this 
finding can be generalized, it suggests that 
agreement about some clinical observations 
can be improved by specifying concrete 
behaviors. 

Results of the comparison between observational
and inferential items were unexpected.
First of all, neither observational
nor inferential items showed a significant
preponderance of items with unanimous
agreement as tested by a four-fold chi square
test. When the combined probabilities of
observational items were tested against those
of the inferential items, not a single F ratio
reached a .05 level of significance for the 12
replications individually, for any of the
patient-therapist combinations, or for the
12 replications. Within the limitations of this
experiment, then, there was no difference in
rater agreement as a function of the degree
of a priori objectivity of the items (Table 3).

There was, similarly, no significant differ- 
ence in overall agreement (observational and 
inferential items combined) between patient- 
therapist combinations. That is, variations in 
the persons observed had no systematic effect 
upon the degree of agreement among the 
raters. Possible interactions with particular 
items may exist but are difficult to prove. 
And, finally and somewhat sadly, there was 
no improvement in degree of agreement as a 
result of practice, communication of criteria 
for rating each item, or specific definition of 
items. Some items improved and some deteri- 
orated. In fact, the highest summated chi 
square for blocks of inferential or observa- 
tional items or combinations of the two is 
TABLE 3


SUMMATED χ² VALUES AND VARIANCES OF INTERJUDGE AGREEMENT IN RATINGS
OF OBSERVATIONAL AND INFERENTIAL ITEMS


Observation        Observational             Inferential                 Total
Period             χ²        V     df        χ²        V     df        χ²        V     df
A1                67.4     2.41    28      124.94    2.61    48      192.34    2.53    76
A2                62.81    1.96    32      115.95    2.42    48      178.76    2.23    80
A3                80.06    2.50    32      138.53    2.89    48      218.59    2.73    80
A1-A3            210.26    2.29    92      379.42    2.63   144      589.69    2.50   236
B1                92.65    2.90    32      110.64    2.21    48      203.30    2.48    80
B2                82.33    2.57    32      107.34    2.15    48      189.68    2.31    80
B3                65.84    2.06    32       93.78    1.88    48      159.63    1.95    80
B1-B3            240.83    2.51    96      311.77    2.08   144      552.60    2.25   240
B4                77.50    2.42    32      114.22    2.28    48      191.72    2.34    80
B5                62.25    1.95    32      104.77    2.10    48      167.02    2.03    80
B6                79.07    2.47    32      107.40    2.15    48      186.47    2.27    80
B4-B6            218.82    2.28    96      326.39    2.18   144      545.20    2.22   240
B1-B6            459.64    2.39   192      638.16    2.13   288    1,097.80    2.23   480
C1               125.82    2.86    44      109.52    2.28    48      235.34    2.56    92
C2               126.25    2.87    44      121.79    2.54    48      248.04    2.70    92
C3               103.28    2.35    44      119.82    2.5     48      223.1     2.43    92
C1-C3            355.35    2.69   132      351.13    2.44   144      706.49    2.56   276
Total (A-C)    1,025.26    2.47   416    1,368.72    2.33   576


not significantly different from the lowest 
value. 

When probabilities are compounded over a 
series of replications, it is possible for much 
variation in agreement to occur among repli- 
cations and still attain the .01 or .001 level 
of significance. To get the flavor of this 
variation it might be worthwhile to examine 
the behavior of one item. Item 31, capable 
of insight, was unanimously agreed upon for 
the three replications of Patient A; for Pa- 
tient B the agreement was 5/6, 3/6, 4/5, 
5/6, 3/5, and 5/6 (none of them significant) ; 
for Patient C all three replications were in 
total agreement. The compounded p value is 
beyond .001. Agreement was perfect for two 
patients and clearly in the direction of agree- 
ment, though not significantly, for the third 
patient. It may be argued that the nature 
of the patient is an important consideration. 
Perhaps so. Evaluation of insight possibilities 
in psychotic patients such as B may be less 
certain than in neurotics and the role of 
insight in improvement may be thought to 
differ as well. On Item 42 only the first three 


replications yielded significant agreement, yet 
p reached .01. 

In order to obtain some estimate of inter- 
rater agreement in quantitative terms, the 
judgments on all items for each judge were 
set up in four-fold tables singly with each 
other judge. Phi coefficients were computed 
for each pair of judges on each of the last 
three observation periods. 
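
The phi coefficient for one pair of judges can be sketched as below. The ratings shown are hypothetical and serve only to illustrate the computation from a four-fold table of present (1) and absent (0) judgments; the function name is likewise an assumption.

```python
from math import sqrt


def phi_coefficient(ratings_a, ratings_b):
    """Phi for two judges' present (1) / absent (0) ratings of the
    same items, computed from the four-fold table of joint counts."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both judges must rate the same items")
    a = b = c = d = 0
    for x, y in zip(ratings_a, ratings_b):
        if x and y:
            a += 1  # both judges: present
        elif x:
            b += 1  # first present, second absent
        elif y:
            c += 1  # first absent, second present
        else:
            d += 1  # both judges: absent
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a marginal total is zero")
    return (a * d - b * c) / denom


# Hypothetical ratings of eight items by two judges.
judge_1 = [1, 1, 0, 0, 1, 0, 1, 0]
judge_2 = [1, 0, 0, 0, 1, 1, 1, 0]
phi = phi_coefficient(judge_1, judge_2)
print(phi)
```

In this made-up example the two judges disagree on two of eight items, and phi is of the same order as the median values reported for the study.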

Phi coefficients ranged from .27 to .70 with
median phi’s for the three successive periods
of .49, .45, and .40 in that order. Even
though the phi coefficients are lower than
Pearson’s rs would be, their values are higher
than those of Gelfand, Quarrington, Widemann,
and Brown (1954) on rating scales of
Rorschach traits and Lisansky’s (1956) phi
coefficients for questionnaire items derived
from the Rorschach. They also exceed the
intercorrelations obtained by Stern, Stein,
and Bloom (1956, p. 113) from Q sorts on
the basis of school records, projective data,
and behavioral observations. While 38 of our
45 phi’s are significant at or beyond the .01
level, the magnitudes are still distressingly
low for purposes of prediction.

One judge was consistently highest in his 
median phi’s with other judges. There was no 
consistency as to who correlated lowest with 
the others. 

Of the items which were retained through- 
out the investigation, only two observational 
items (16 and 20) and one inferential item 
(34a) failed to be agreed upon beyond chance 
expectations. This finding can be interpreted 
in contradictory ways. Since the judges 
agreed beyond chance on most of the items 
over the whole experiment, they were evi- 
dently perceiving something in the way of 
communality. On the other hand, they were 
less consistently in agreement than seems 
desirable and their agreement fluctuated in 
no predictable fashion. 

One can become lost in trivia and post hoc
rationalizations in examining the behavior of
specific items. Since a number of items
showed unanimity in the judgments of the
raters for one or more patients and not for
others, it might be suspected that some patients
present more clear-cut evidence about
some clinical variables than other patients do
and that patients differ in the kinds of clinical
data for which they show evidence. To be
sure, patients vary somewhat from hour to
hour and it may be expected that their observers
and therapists do also. Our data can
be interpreted in whichever direction the
reader’s bias lies. It can be argued that the
variations in amount of agreement from interview
to interview render the clinical data
practically useless for a given clinical situation.
On the other hand, the fairly high amount
of agreement over the 12 replications indicates
that there is significant communality of
clinical perception.

It should be remembered that the maximum
number of raters was six and that a
divergent opinion of one rater makes quite
a difference in the amount of agreement. A
larger number of raters would lessen the effect
of a single rater.


DISCUSSION

A rough generalization that can be drawn
from these findings is that for most of the
items the judges agreed in the combined
data beyond chance expectations, but that
the degree of agreement varied from item to 
item, patient to patient, and replication to 
replication in no predictable fashion as others 
have found (Forer, Farberow, Meyer, & 
Tolman, 1952; Gelfand et al., 1954) in their 
study of Rorschach ratings. Such unsystem- 
atic variation in agreement does not neces- 
sarily mean that clinical observation is too 
subjective to be of practical significance. It 
does mean, however, that some of the pa- 
rameters of clinical observation and inference 
could, perhaps, be profitably re-examined and 
reformulated. 

We might ask ourselves whether the 
present experiment is a fair test of the clinical 
interview. Was the situation real enough; did 
it tap variables that are ordinarily involved 
in therapists’ observations? Would a therapist 
ask himself such questions during a therapy 
session, or are these judgments generally the 
result of summarizing observations gleaned 
from long-term contact with the patient? 
Yet, many of these variables are assessed at 
the end of single initial interviews with 
reference to diagnostic or therapeutic goals. 

Studies of the accuracy of clinical judgment 
and prediction have yielded little evidence of 


general superiority attributable to profes- 
sional training (Cline, 1955; Grigg, 1958;


King, Ehrmann, & Johnson, 1952; Lisansky, 
1956; Luft, 1950, 1951). Our judges’ senior 
status suggests a fairly high level of profes- 
sional competence, but it does not imply that 
further development of skills or radical 
changes in orientation are unlikely to occur, 
even though we would ordinarily assume 
that they had reached an asymptote in their 
perception of most of the variables used in 
this experiment. Evidence is not impressive 
that homogeneity of training necessarily con- 
duces to homogeneity of judgment. Is it, 
then, a fact that for certain kinds of clinical 
judgment the variance attributable to indi- 
vidual differences among judges exceeds that 
attributable to professional training? It seems 
so in this study. We are forced to agree with 
Bendig (1956) that individual differences 
among judges may outweigh many facets of 
the rating process and the data to be judged. 

Discussion of terms after each series of 
three observation periods had no measurable 
effect on degree of agreement. Lisansky
(1956) believes that improvement can occur 
in the rating of Rorschach protocols, despite 
her and our (Forer et al., 1952) empirical 
findings to the contrary. Wiener’s (1958) 
belief that “We can and must train ourselves 
to agree on the judgments we make from 
projective test protocols” seems more a wish 
than a likely prospect. 

There is a possibility that amount of rater 
agreement is inversely related to the amount 
of data which the clinician must process. The 
task of sorting a large mass of heterogeneous 
data may create interference with the evalua- 
tion of any one of the classification variables. 
Evidence suggests that there may have been 
too much rather than too little information, 
possibly of a contradictory nature, and too 
many ratings competing for the observer’s 
attention (Borke & Fiske, 1957; Cutler, 
Bordin, Williams, & Rigler, 1958; Gage, 
1953; Giedt, 1955; King et al., 1952; 
Kostlan, 1954; Luft, 1950, 1951). 

Perhaps clinicians need to take stock of 
what they are asking from themselves, to ap- 
praise realistically rather than hopefully 
what is possible, so that they need not be 
unduly apologetic nor defensively nihilistic 
toward research evidence that questions their 
prowess. 

It seems unlikely from this and other 
studies that any conceptual system or any 
amount of training can engender the degree 
of conformity or reproducibility in clinical 
perception and judgment that is achieved by 
standardized tests or electronic computers. 
Would such a state of affairs be desirable? 
The growing supplementation and frequent 
replacement of objective tests by projective 
methods in diagnostic work suggests that ob- 
jective methods leave something to be desired. 
The price of objectivity is limitation of in- 
formation, and the clinician feels the need 
for more and different kinds of information 
than that provided by objective tests, even 
though the information be of a lower order 
of reliability. Factually he deals with pa- 
tients’ verbalizations which are also of a low 
order of reliability and it is through his 
inferences that the clinician constructs a 
relatively stable conceptual model of his 
patients. 
Complete unanimity of clinical judgment
would represent constriction of the range of 
cues and of clinical attention, hence of thera- 
peutic activity. Zero variance among thera- 
pists would likely generate low variance in 
therapeutic activity. But therapists differ 
inevitably as persons, in their preferred 
theoretical systems, in their ability to use 
particular techniques, and in their apparent 
effectiveness in dealing with different kinds 
of patients. The all-around therapist who 
works equally well with all kinds of patients 
is as much a myth as the psychological test 
that measures every aspect of the psyche. 

It may be that the less than perfect agree- 
ment in the clinical observations described 
above reflects those individual differences 
among therapists that enable them to special- 
ize, learn from one another, grow continually 
in their skills, and discover new concepts and 
techniques. 


SUMMARY 


As a means of investigating the reliability 
of psychologists’ perception of clinical data, 
six diplomates in clinical psychology observed 
three patient-therapist teams for a total of 
12 weekly psychotherapy sessions. Independ- 
ent ratings of present or absent were made on 
a check-list containing a number of presum- 
ably observational and inferential items. 
After each series of three sessions the 
clinicians discussed, redefined, and replaced 
items with the goal of increasing interjudge 
agreement. 


1. On very few items was there consist- 
ently significant agreement among the judges. 

2. The amount of agreement on most items 
varied from session to session and patient to 
patient in no detectable pattern. 

3. While the amount of agreement com- 
pounded over the 12 sessions was significantly 
beyond chance expectations for most items, 
it was not sufficiently substantial to warrant 
confidence in the judges’ observations. 

4. The amount of agreement among judges
was not affected by the apparent objectivity
of the item. Even more precisely operationally
defined items, such as silence of 30 seconds
or more, were not consistently agreed
upon.


5. There is no evidence that practice in 
judging, increased contacts with a particular 
patient, or discussion and clarification of 
items enhance objectivity. 
This study was concerned with the thera-
pist, the patient, and their relationship in 
psychotherapy. It dealt with authoritarianism 
as a personality trait in each of the indi- 
viduals, and tested for associations between 
authoritarianism as a trait, attitude, and be- 
havior. A major hypothesis of this study was 
that the peculiar interaction of authoritarian- 
ism in therapist and patient would be crucial 
to the development of the therapeutic rela- 
tionship. Although this study did concern it- 
self with authoritarianism, this trait was not 
necessarily thought to be the most basic or 
critical aspect of the therapeutic relationship. 
It was selected for study here to demonstrate 
the importance of considering the personality, 
needs, and motives of therapist and patient, 
as they interact in the therapeutic relation- 
ship. 

Of the many personality variables that 
might be studied in this manner, the writer 
chose to consider authoritarianism, as deline- 
ated in the major work on The Authoritarian 
Personality (Adorno, Frenkel-Brunswik, Levinson,
& Sanford, 1950). It was thought that
the patient population of any clinic might 
not be as individualistic, equalitarian, and 
self-actualizing as some writers seemed to assume.
Further, even a generally equalitarian
patient may develop authoritarian expectations
about psychotherapy from his experi-
ence with other professions. For the therapist, 
we know that there is a wide range of thera- 
peutic behavior in terms of training and ori- 
entation, to say nothing of the range of atti- 
tudes and needs they may have. Authoritarianism,
then, was thought to be one of the
trait dimensions relevant to patient and therapist
roles.

1 Based on a doctoral dissertation submitted to the
University of Chicago, 1959. The writer is indebted
to Donald W. Fiske, Desmond S. Cartwright, and
Ralph W. Heine for their encouragement and help.

A major issue which still surrounds authori- 
tarianism as measured by the F Scale refers 
to the question of its behavioral implications 
and correlates. Titus and Hollander (1957) 
raise serious question about the relationship 
between authoritarian attitudes and behavior. 
They urge special caution where interper- 
sonal behavioral implications are to be drawn. 
Christie, on the other hand, makes a strong 
case for congruency of F Scale scores and 
predicted behavior, citing four studies in sup- 
port of his position (Christie & Jahoda, 1954, 
p. 145). As a test of the question, this study 
hypothesized that authoritarianism, as a trait 
of therapists and patients, would find expres- 
sion in their attitudes toward psychotherapy 
and their behavior in therapy. 

The second basic hypothesis of this study 
follows from the argument that a similarity 
of personality traits in patient and therapist 
tends to facilitate the relationship. Barron's 
thesis (1950) seems to be the first study to 
consider both patient and therapist variables 
in an experimental approach to the therapeutic
relationship. Axelrod (1951) argued
and found partial support for the hypothesis 
that progress in therapy was more likely when 
the personalities of patients and therapists 
were similar than when they were dissimilar. 


Underlying this hypothesis was the theory
that

the presence of an emotional identification or empathy
between patient and therapist, springing from
common emotional experience and manifested more
or less by a similarity of personalities, is a condition
favorable for the successful development of the
therapeutic process (pp. 4-5).

Studies by Bown (1954), Hiler (1958),
Libo (1957), and Ashby, Ford, Guerney, and
Guerney (1957) are pertinent considerations
of this question. Although the evidence is 
something less than substantial, there does 
seem to be a line of thought suggesting that 
there is an interaction between the person- 
ality traits of therapist and patient, and that 
generally a similarity of traits tends to facili- 
tate the relationship. This position receives 
some support from studies in the fields of 
leadership and education (Goldberg & Stern, 
1952; Haythorn, Haefner, Langham, Couch,
& Carter, 1956; Jones, 1954; Sanford, 1950). 
The second basic hypothesis of this study 
states that a similarity of therapist and pa- 
tient along the trait dimension of authori- 
tarianism—equalitarianism is related to the es- 
tablishment of successful or good therapeutic 
relationships.

Sanford (1956) raises some question about
whether authoritarian patients, without reference
to therapist traits, may not have real
difficulty forming therapeutic relations with
any therapist. He is rather blunt on this
point, writing:

The person high on F rarely seeks, but rather resists
the idea of psychotherapy; and once a start has been
made, the technical problems are trying (p. 313).

Sanford then goes on to note a study by
Freeman and Sweet (1954) in which they
offered evidence that patients with many features
of the F pattern actually respond better
in certain forms of group therapy than they
do in individual therapy. This argument, obviously,
refers to patient traits only. As a
more parsimonious explanation of therapeutic
failure the question merits testing as
a specific hypothesis of this study.


METHOD 
Instruments 


The California F Scale as a measure of authoritarianism
was taken directly from Forms 45 and 40
as published by Adorno et al. (1950). One item: “It
is best to use some prewar authorities in Germany
to keep order and prevent chaos,” was omitted as
untimely and probably ambiguous.
Scores were derived by converting responses
from -3 to +3 to positive
scores ranging from 1 to 7, with no response scored
as 4. The sum of scores over the items was used
for tests of hypotheses in this study.

An Authoritarian-Equalitarian Therapy sort (designated
as AET) was especially developed for this
study. A 40-item card sort was constructed contain-
ing 20 items reliably prejudged as descriptive of an
authoritarian therapy relationship, and 20 items similarly
prejudged as descriptive of an equalitarian
therapy relationship. By verbal instruction the subjects
were asked to sort the 40 items in 8 piles of
5 items each. The piles were numbered from 1 to 8,
pile Number 1 designated as “Least True or False,”
pile Number 8 as “Most True.” Patients were asked
to sort the 40 items to indicate “which of these
things you would like to have be most true and
which of these things you would like to have be
least true, or even false, about the relationship between
you and your doctor (therapist, counselor).”
Therapists were given essentially similar instructions
with added emphasis on the expression of “own
opinions” rather than what they had been taught or
had read. Each item was scored according to the pile
number in which it was placed. The scores for the
20 authoritarian items were summed for each subject
and designated the AET score, with a possible
range from 50 to 130. For each patient-therapist pair
an AET Discrepancy Score was computed by sum- 
ming the squares of the score differences over the 40 
items. This Discrepancy Score is, of course, a nega- 
tive function of the correlation between patient and 
therapist sorts. The Discrepancy Score was consid- 
ered an adequate representation of similarity and 
differences of patient and therapist attitudes toward 
therapy along the specific dimension of authoritarian- 
equalitarian attitudes and behaviors. 
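
The scoring just described can be sketched in a few lines of Python. This is an illustration only, not the study's procedure as run: the item ordering, the choice of which 20 positions hold the authoritarian items, and all function names are assumptions made for the example. Because every sort uses the same forced distribution (5 items in each of piles 1 to 8), each sort has the same mean (4.5) and the same sum of squared deviations (210), so the Discrepancy Score D and the Pearson r between two sorts are tied by D = 420(1 - r). That identity is the sense in which the Discrepancy Score is a negative function of the correlation.

```python
import random
from statistics import mean

PILES = range(1, 9)      # piles numbered 1 ("Least True or False") to 8 ("Most True")
ITEMS_PER_PILE = 5       # forced distribution: 8 piles x 5 items = 40 items


def random_sort(rng):
    """Return a hypothetical 40-item sort as a list of pile numbers."""
    scores = [pile for pile in PILES for _ in range(ITEMS_PER_PILE)]
    rng.shuffle(scores)
    return scores


def aet_score(sort, authoritarian_idx):
    """Sum the pile numbers of the 20 authoritarian items."""
    return sum(sort[i] for i in authoritarian_idx)


def discrepancy(sort_a, sort_b):
    """Sum of squared pile-number differences over all 40 items."""
    return sum((a - b) ** 2 for a, b in zip(sort_a, sort_b))


def pearson_r(x, y):
    """Plain Pearson product-moment correlation."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


rng = random.Random(0)
patient = random_sort(rng)
therapist = random_sort(rng)
authoritarian_idx = range(20)  # assumption: items 0-19 are the authoritarian ones

d = discrepancy(patient, therapist)
r = pearson_r(patient, therapist)
print(aet_score(patient, authoritarian_idx), d, round(r, 3))
```

The bounds of the AET score follow from the same forced distribution: placing all 20 authoritarian items in piles 1 to 4 gives 5(1 + 2 + 3 + 4) = 50, and placing them all in piles 5 to 8 gives 5(5 + 6 + 7 + 8) = 130.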

A Therapist Rating Scale was developed, drawing 
heavily from an instrument developed at the Uni- 
versity of Chicago Counseling Center (Rogers & 
Dymond, 1954, p. 101) and currently in use there. 
Several items of the original form were omitted to 
produce a shorter rating blank. A new item was introduced
in which the therapist was asked to rate
the “quality of the relationship,” thus: “Does this 
seem to be a ‘good’ and effective therapeutic rela- 
tionship? How do you estimate the quality of the 
therapeutic relationship between yourself and this pa- 
tient?” (nine-point scale from “poor” to “good”). 
The rater’s estimate of patient satisfaction in the relationship
was retained in its original form, thus:
“Estimate the patient’s feeling about the relationship”
(nine-point scale from “strongly dissatisfied”
to “extremely satisfied”). Only these two items are
utilized in the present study.

The formation of successful or better therapeutic 
relationships as a criterion was assumed to be di- 
rectly related to the various types of criteria em- 
ployed in other studies, but it was thought to have 
a specific pertinence of its own, as elemental or more 
basic. It seemed reasonable to attempt a direct meas- 
ure of the quality of the relationship. It was as- 
sumed that the quality of the relationship is largely 
determined and may be evaluated in the very early 
contacts of patient and therapist. Although a rating 
of patient satisfaction may not be one of the essen- 
tial goals of psychotherapy and may not be directly 
related to the quality of the relationship, it was 
thought to be a useful supplementary criterion measure.
No matter how good the quality of the relationship
may appear to the therapist or judges, the de-
gree of patient satisfaction with its implications for 
continuance in therapy or premature termination 
may be a crucial evaluation. 

An Observer Rating Scale was developed for the 
use of judges in rating patient and therapist behav- 
iors as observed on short recorded segments of ther- 
apy. Items 1 and 2 provided estimates of the qual- 
ity of the relationship and patient satisfaction, and 
were identical in form to the items described above. 
In Item 3 the therapist’s behavior in the recorded 
segment was rated on five dimensions: aggressive- 
submissive, directive-nondirective, highly anxious-low 
anxiety, dominating-equalitarian, and rigid-flexible. 
In Item 4 the patient’s behavior was rated on these 
five dimensions: aggressive-submissive, dependent- 
self-sufficient, highly anxious-low anxiety, conven- 
tional-individualistic, and rigid-flexible. From the 
many qualities and behaviors attributed to authori- 
tarians in the literature, these five in each case were 
selected as being both relevant and ratable. In Item 
5 the judge was asked to rate the behavior of the 
therapist along the single dimension of authoritarian- 
equalitarian on a nine-point scale. In Item 6 a dis- 
tinction was made between dominant and submissive 
types of authoritarian behavior by the patient. Domi- 
nant behavior was defined by aggressive active au- 
thoritarian behavior, while submissive behavior was 
defined by passivity or deference, expecting or seeking 
authoritarian behavior by the other. Although dominant
and submissive authoritarian behaviors were
thought to be dynamically related, it seemed plau- 
sible to consider the two aspects mutually exclusive 
in any short sample of behavior. Thus, a V-shaped 
scale was used, with equalitarian at the apex and 
authoritarian-dominant and authoritarian-submissive 
at each of the two extensions, each on a nine-point 
scale. The judge was asked to select the aspect most 
prominent in the given segment and make a rating 
on the selected scale. 


Samples 


The subjects were drawn from two clinic popula- 
tions. Those designated as Group A include treatment 
cases in the Psychiatry Clinic of Albert Merritt 
Billings Hospital, University of Chicago. Senior medi- 
cal students are required, as part of their training in 
psychiatry, to treat in psychotherapy one selected 
patient who has been referred to the clinic. It should 
be noted that these students had little training and 
no prior experience in psychotherapy. All patients 
were told that their treatment would be limited to 
18 weeks’ duration, after which they would be either 
terminated or referred elsewhere. The present sample
is composed of patients and therapists drawn from 
this program during two successive quarters. Of the 
35 patients originally tested for this study, 1 was
eliminated because of a suggested organic involvement,
1 for an alleged inability to read, and 1 for
failing to keep the first and subsequent
therapy appointments. The remaining sample of 32
cases included 15 males and 17 females, with a mean
age of 38 years, ranging from 23 to 68 years. 

The subjects designated as Group B were drawn 
from the client population of the University of 
Chicago Counseling Center. Clients are normally as- 
signed to therapists on the basis of therapist avail- 
ability, and clients who agree to participate in re- 
search studies are then randomly assigned to proj- 
ects in progress at that time. The present sample 
includes 30 cases assigned to the writer's project over 
a 6-month period. The therapists in this group in- 
cluded three staff members with extensive experi- 
ence, seven staff members with some or considerable 
experience, and seven students in training who were 
seeing their first or second cases. The client popula- 
tion included 16 males and 14 females, with a mean 
age of 27 years, ranging from 19 to 43 years. 

The population in Group A includes 32 patients 
and 32 therapists, each patient seeing a different 
therapist. In Group B, the population includes 30 
clients and 17 therapists, several therapists treating 
more than one client in this sample.


Collection of Data 


Patients and therapists were seen prior to their 
first therapeutic interview and were asked to com- 
plete the F Scale and AET sort. After the second 
therapeutic interview, the therapist completed the 
Therapist Rating Scale. 

Observer ratings were made on Group A only. Re- 
cordings of the first interview were retained, and 5- 
minute segments were selected from the beginning 
and ending of each interview. These segments were 
rerecorded in random order, with at least five other 
segments between the two segments of any given in- 
terview. Two judges (the writer and another gradu- 
ate student of psychology, both with training and 
experience in psychotherapy) rated each segment on 
the Observer Rating Scale. Thus, for each case there 
were four ratings: beginning and ending segments by 
each of two judges. One recording was inaudible and 
tests based on judges’ ratings will be drawn from an 
N of 31. Reliability of judges’ ratings was tested on 
each of the 14 scales of the rating form. The two
judges’ summed ratings (beginning plus ending segments)
were significantly correlated on 10 of the 14
(9 at the .01 level, 1 at the .05 level: r = .38). Only
these 10 items were utilized in this study. It is striking
that the four items on which the judges were not
in agreement all dealt with patient traits.


RESULTS


Authoritarianism, as a personality trait of 
the therapist, was hypothesized to be signifi- 
cantly related to his description of the ideal 
therapeutic relationship in terms of directive, 
paternalistic, and nurturant qualities. Full 
scale scores on the F Scale were compared to 
AET scores. For the 32 therapists in Group A
the Pearson r correlation was .03, a clearly
nonsignificant result. For Group B, with 17
therapists, the Pearson r correlation was .62,
significant at the .01 level.

It was predicted that therapists character- 
ized by authoritarianism would tend to show 
more authoritarian behavior in their therapy 
than those characterized as equalitarian. The 
Observer Rating Scales were utilized here. The 
31 therapists were dichotomized on the basis 
of their F Scale scores, 16 low and 15 high. 
Results in tests of this hypothesis may be 
summarized as follows: (a) On a global be- 
havioral rating of authoritarian-equalitarian 
the high F scorers were rated significantly 
more authoritarian than low F scorers. (b) 
Although high and low F scorers did not 
differ on the full scale dimension of aggres- 
sive-submissive, they did differ on their de- 
viation from “appropriate” mid-point behav- 
ior, i.e., high scorers were given more extreme 
ratings on this dimension. (c) High scorers 
were rated as more directive, anxious, and 
dominating than low scorers, but not significantly
so. (d) Behavior of high scorers was
rated as significantly more rigid than that of
low scorers.

Authoritarianism, as a personality trait of
the patient, was hypothesized to be significantly
related to his description of the ideal
or preferred therapeutic relationship in terms
of directive, paternalistic, and nurturant qualities.
Full scores on the F Scale were compared
to AET scores. The 32 patients in Group A
showed a Pearson r correlation of .34, significant
at the .05 level. In Group B, with 30 patients,
the Pearson r correlation was .38, significant
at the .05 level.

It was predicted that patients characterized
by authoritarianism would tend to show more
authoritarian behavior in their therapy than
those characterized as equalitarian. The Observer
Rating Scales were utilized here: a
global rating of patient behavior on a nine-point
scale and a rating on patient aggression.
As a test of this hypothesis the 31 cases
were dichotomized on the basis of the patient’s
F Scale score, 15 low and 16 high. On
the global rating of patient behavior the difference
between low and high scorers was not
significant. The two groups did not differ on
the full scale dimension of aggressive-submissive.
High scoring patients did show the larger
deviation from “appropriate” mid-point behavior
as predicted, but the difference between
groups was not significant.

In line with the argument of Sanford, dis- 
cussed above, it was predicted that patients 
who are characterized as equalitarian will tend 
to form better therapeutic relationships than 
those characterized as authoritarian. In Group 
A, the hypothesis was tested against four cri- 
terion measures: the therapist’s rating of the 
quality of the relationship, therapist’s esti- 
mate of patient satisfaction, judges’ composite 
rating of the quality of the relationship, and 
the judges’ composite estimate of patient 
satisfaction. The differences between low and 
high scoring patients on the therapist ratings 
were not significant. The differences on judges’ 
ratings were both in the predicted direction. 
Judges rated the quality of the relationship 
significantly (t = 2.50, p < .01) higher for
the group of low F scorers, and the estimate 
of patient satisfaction was slightly higher for 
this group but not significantly so. In Group 
B the hypothesis was tested against two cri- 
terion measures, the therapist’s rating of the 
relationship and his estimate of patient satis- 
faction. Differences between low and high 
scorers were not significant. 
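
The analysis pattern used in these comparisons (score the F Scale, split cases into low and high scorers, compare mean criterion ratings) can be sketched as follows. This is a hedged illustration under stated assumptions: the text does not say which form of the t test was used, so a pooled-variance two-sample t is shown as one common choice; the split at the group mean follows the remark in the Discussion; every function name and every number in the example data is hypothetical and is not taken from the study.

```python
from statistics import mean, variance


def f_item_score(response):
    """Convert one F Scale response (-3..+3, no zero) to 1..7; None (no response) scores 4."""
    if response is None:
        return 4
    if response not in (-3, -2, -1, 1, 2, 3):
        raise ValueError(f"unexpected response: {response!r}")
    return response + 4


def f_scale_total(responses):
    """Sum of converted item scores for one subject."""
    return sum(f_item_score(r) for r in responses)


def split_at_mean(totals):
    """Return (low, high) lists of case indices, dichotomized at the group mean."""
    cut = mean(totals)
    low = [i for i, t in enumerate(totals) if t < cut]
    high = [i for i, t in enumerate(totals) if t >= cut]
    return low, high


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (assumed form of the test)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5


# Hypothetical data: F Scale totals and nine-point quality ratings for six cases.
totals = [70, 85, 90, 120, 130, 140]
quality = [7, 6, 8, 5, 4, 6]

low, high = split_at_mean(totals)
t = pooled_t([quality[i] for i in low], [quality[i] for i in high])
print(low, high, round(t, 2))
```

A positive t here means the low F scorers received the higher mean rating, which is the direction predicted in the text; judging significance would still require the degrees of freedom (na + nb - 2) and a t table.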

The last three hypotheses were developed 
from the argument that similarity of patient 
and therapist personalities facilitates the de- 
velopment of good therapeutic relationships. 
It was hypothesized that patients character- 
ized by authoritarian traits would tend to 
form better therapeutic relationships with 
therapists characterized as authoritarian than 
with those characterized as equalitarian. For 
a test of this and the following hypothesis the 
dichotomies between high and low scorers in 
patient and therapist groups were retained. 
First, each of the patients characterized as 
authoritarian was considered with his respec- 
tive therapist. Mean criterion ratings are 
shown in Table 1. In Group A the hypothe- 
sis was tested against the four criterion meas- 
ures listed above. All differences were nonsig- 
nificant. In Group B the hypothesis was tested 
against the two criterion measures listed 
above. Both differences were nonsignificant. 

TABLE 1

MEAN CRITERION RATINGS ON AUTHORITARIAN AND EQUALITARIAN GROUPS OF PATIENTS WITH
THEIR RESPECTIVE AUTHORITARIAN AND EQUALITARIAN THERAPISTS

                                       Criterion (a)

                                  Group A               Group B
                            Therapist     Observer      Therapist
                             Rating        Rating        Rating
                            QR     PS     QR     PS     QR     PS
Authoritarian Patients
  Authoritarian Therapist   5.89   6.11   3.44   4.95   5.80   6.20
  Equalitarian Therapist    7.14   6.71   4.14   5.57   5.00   5.25
Equalitarian Patients
  Equalitarian Therapist    5.89   5.55   5.39   5.97   5.70   5.40
  Authoritarian Therapist   6.00   6.14   4.38   4.75   6.28   6.80*

(a) QR = Quality of Relationship, PS = Patient Satisfaction.
* Difference significant at .05 level, in a direction opposite to that predicted.

Secondly, it was hypothesized that patients
characterized by equalitarian traits would
tend to form better therapeutic relationships
with therapists characterized as equalitarian 
than with those characterized as authoritarian. 
Each of the patients characterized as equali- 
tarian was considered with his respective 
therapist. Mean criterion ratings are shown 
in Table 1. For Group A, on the four cri- 
terion measures, all differences were nonsig- 
nificant. In Group B both therapist ratings 
were in a direction opposite to that predicted, 
with the difference on rated patient satisfac- 
tion significant at the .05 level. 

In the last hypothesis, therapist and patient
descriptions of ideal or preferred therapy
conditions (AET) were utilized. Discrepancy
Scores for each case were computed as
previously described. These scores were dichotomized
in terms of low and high discrepancy.
It was predicted that the quality of the
therapeutic relationship would be related to
the degree of discrepancy between patient and
therapist expectations of authoritarian attitudes
and behavior in therapy. Mean criterion
ratings of low and high discrepancy groups
are shown in Table 2. Although only one of
the differences was statistically significant,
low discrepancy cases received higher ratings
on all criterion measures in both groups.

TABLE 2

MEAN CRITERION RATINGS ON CASES WITH HIGH AND
LOW DISCREPANCY BETWEEN THERAPIST AND
PATIENT AET SORTS

                                       Criterion (a)

                                  Group A               Group B
                            Therapist     Observer      Therapist
                             Rating        Rating        Rating
Patient Group               QR     PS     QR     PS     QR     PS

High Discrepancy            6.06   6.00   4.16   5.12   5.00   5.53
Low Discrepancy             6.31   6.19   4.55   5.58   6.33*  6.13

(a) QR = Quality of Relationship, PS = Patient Satisfaction.
* Difference significant at .05 level.


DISCUSSION


The failure to find a relationship between 
F Scale scores and attitudes toward therapy 
in the therapist population of Group A may 
be related to the nature of the F Scale items 
and the students’ reaction to them. It has 
been said 


that authoritarian people as measured by the scale
agree more with authoritative statements; and that, 
therefore, a portion of the discriminatory power of 
the F scale derives from its form, rather than its con- 
tent (Leavitt, Hax, & Roche, 1955, p. 221). 


The very authoritative tone of the statements 
in the F Scale, referred to as a form charac- 
teristic, may, however, operate with reactive 
effect on some subjects. Several of the therapists
(who, it will be recalled, were senior
medical students) commented on the stringent 
wording of the statements. One student commented,
“In medical school one of the
first things you learn is to suspect any state- 
ment with ‘always’ or ‘never’ in it.” These 
are not individuals who are rigidly or self- 
consciously equalitarian, but rather students 
trained to be critically sensitive to the literal 
meaning of words, and to hold in suspicion 
all authoritative sounding statements. The 
form component may, in such cases, have an 
inhibitory, and thus invalidating, effect. 

Since therapists’ scores on the F Scale do 
correlate quite well with their rated behavior 
in therapy it may be more reasonable to view 
their F Scale scores as a relatively reliable 
representation of authoritarianism as a per- 
sonality trait and to re-evaluate their expres- 
sion of attitudes toward therapy. It is well to 
remember that this population of therapists 
is composed of students with no experience 
and very little training in psychotherapy. 
They probably had few consciously developed 
attitudes toward therapy. By contrast, the
therapists in Group B, with more training
and experience in therapy, do show a consistency
between personality trait and therapy
attitudes. It may be proposed that one
of the consequences of training and experience
is the increased congruence of therapist traits
and attitudes, a greater consistency between
the personality of the therapist and his consciously
held and expressed attitudes toward
therapy. Whether such congruency is the effect
of training or experience, or both, can
and should be tested.

It was noted that the judges rated the quality
of the relationship significantly higher for
the patient group of low F scorers, while differences
between therapist ratings were not
significant. It may be that this reflects
differences in conception of the “patient”
role, and differences in what constitutes
a “good and effective therapeutic relationship.”
Some differences in perspective between
therapist and judges may also be operating
here.

The finding that the rated quality of the
relationship is related to the degree of similarity
of patient and therapist descriptions of
a preferred relationship on items specifically
defining authoritarianism tends to support
the second basic hypothesis of this study. 
The quality of the relationship and an esti- 
mate of patient satisfaction in the early inter- 
views appear to be somewhat predictable. To 
say this in another way: there does seem to 
be some pretherapy data from which we could 
anticipate good or poor, satisfying or unsatis- 
fying, therapeutic relationships. 

An observation may be made on the failure 
to find a relationship between the criterion 
and similarity on F Scale scores. Dichotomiz- 
ing cases at the mean F Scale score for the 
group is probably too gross a division. For 
individuals not scoring in the extreme, high 
or low, authoritarianism is probably not the 
most crucial trait. The writer would speculate
that for these individuals there are other
traits, attitudes, and needs which play a more
crucial role in determining the quality of 
their interpersonal relationships. 

It may also be observed that attitude items, 
the AET sort, have a greater immediacy or 
relevance to the therapy situation than F 
Scale items. Many AET items refer to atti- 
tudes or behaviors which are very soon con- 
spicuous by their presence or absence. By 
contrast, the F Scale measures a more funda- 
mental trait which may not express itself so 
immediately or directly. In spite of the care- 
ful manner in which the AET items were de- 
veloped, it may be that the sort contains sev- 
eral items of serious import to the develop- 
ment of the relationship, but not heavily 
loaded with authoritarianism. The method of 
deriving the Discrepancy Score, by summing 
the squares of the pile number differences over 
all items, gives an equal impact to all items. 

This discussion should not, however, ob- 
scure the finding that similar attitudes of 
therapist and patient toward therapy were re- 
lated to better therapy relationships. We are 
still some way from the point at which we 
can “match” patient and therapist to maxi- 
mize success in therapy. As a therapist, the 
writer doubts that research of this kind will 
ever take all of the “mystery” and the essen- 
tially personal quality out of psychotherapy. 
Research may, however, help us to avoid the 
more blatant difficulties, and thus permit the 
more individual aspects of psychotherapy to 
operate more effectively. 


108 John L. Vogel 


SUMMARY 


It was predicted that authoritarianism, as 
a personality trait of therapist and patient, 
would be reflected in their attitudes toward 
therapy and in their therapeutic behavior. 
Secondly, it was hypothesized that authori- 
tarianism and equalitarianism, as interacting 
personality traits of therapist and patient, 
would have specified effects upon the quality 
of the relationship established. 

A total of 62 patients and 49 therapists in 
two clinic populations completed the Cali- 
fornia F Scale and a specially devised in- 
strument in which they described the ideal or 
preferred therapeutic relationship. After the 
second interview these therapists completed a 
scale containing two criterion items: a rating 
of the quality of the relationship and an esti- 
mate of patient satisfaction with the relation- 
ship. In one of the two clinic settings, two 5- 
minute segments were selected from each of 
the first interview recordings. For each seg- 
ment, two judges rated the two criterion items 
and specific and general traits referring to au- 
thoritarian behavior on the part of the thera- 
pist and patient. 

Authoritarianism (as measured by the F 
Scale) was found to be related to authori- 
tarian attitudes toward therapy in both pa- 
tient populations and in one of the two thera- 
pist populations. The hypothesis that au- 
thoritarianism, as measured by the F Scale, 
would be related to authoritarian behavior in 
therapy was supported for the therapist popu- 
lation, but not for the patients. A test of the 
hypothesis that equalitarian patients would 
form better therapeutic relationships than 
authoritarian patients gave equivocal results. 
The second basic hypothesis, that similarity 
of therapist and patient on the specific di- 
mension of authoritarian—equalitarian would 
tend to facilitate the relationship, was not 
supported. There was, however, an associa- 
tion between criterion ratings and the amount 
of discrepancy between therapist and patient 
descriptions of the ideal or preferred relation- 
ship on items related to authoritarianism. 


SOCIAL DESIRABILITY AND RESPONSE TO PERCEIVED
SITUATIONAL DEMANDS 


DAVID MARLOWE 
College of Medicine, University of Kentucky 

Current research on social desirability
(Cowen & Tongas, 1959; Edwards, 1957;
Wiggins & Rumrill, 1959) has been chiefly
concerned with a descriptive analysis of the
influence of this variable on personality test
responses. Along these lines, social desirability
has achieved major status as a psychometric
variable, the properties typically ascribed to
it being those of a stylistic response determinant
(Jackson & Messick, 1958). Pre-eminently,
social desirability is considered to be
a characteristic of test items (Edwards, 1957),
and two models have been applied to its assessment.
In the first of these procedures,
items on a test are rated for social desirability
by judges, and then responded to by subjects
(Ss) under standard instructions (Edwards,
1953; Rosen, 1956); the correlation
of the two sets of responses is inferred to indicate
the amount of test response variance
accounted for by social desirability. The second
model involves the development of rational
or empirical social desirability scales
(Edwards, 1957; Hanley, 1957), the items of
which show marked social desirability properties.
Correlations between these scales and
various personality tests, such as the MMPI,
are assumed to reflect social desirability bias
in the test responses. This method
can also be employed to identify dissimulators,
i.e., those Ss whose personality test responses
conform to the cultural stereotypes
represented by the social desirability scale
(Wiggins, 1959).

The present conceptions of social desirability
thus reflect an exclusive concern with
response distortion in psychometric situations,
with an attendant narrowing of research interests
to investigations of the social desirability
scalability of test items. The concept
of social desirability has not been systematically
investigated in terms of the motivation
of Ss to dissimulate on personality tests and
the relevance of this motivation to behavior
in other, nontest situations.1 This latter con-
ception suggests research in which the differ- 
ential influence of the need to respond in a 
socially desirable fashion would be investi- 
gated in situations where “self” or “item” 
evaluation is not the primary dependent vari- 
able. The present experiment was undertaken 
with this view in mind. 

In a recent report, the writers (Crowne & 
Marlowe, 1960) described the development 
and preliminary validation of a new social 
desirability scale (M-C SDS) and outlined 
the construct of which the scale is at present 
the sole operational definition. In the initial 
study, however, only the essentials of a mo- 
tivational concept of social desirability were 
suggested, and it is desirable here to present 
in further detail some of the implications of 
the construct. 

Social desirability, as presently defined, re- 
fers to a need for social approval and ac- 
ceptance and the belief that this can be at- 
tained by means of culturally acceptable and 
appropriate behaviors. In a psychometric 
situation, a high need for social approval 
would be inferred from a person’s attribution 


1 Shortly after the completion of this experiment,
Allison and Hunt (1959) reported a study investi- 
gating the relationship between Edwards SDS and 
aggressive responses to varying conditions of frustra- 
tion as measured by a paper-and-pencil test. They 
interpreted their results as indicating that “the [ag- 
gression] ‘suppressing’ effect of the SD factor occurs 
primarily in situations in which the culturally ac- 
ceptable response is not evident” (p. 532). While 
Allison and Hunt are careful to refer to social de- 
sirability as a “factor,” their recasting of Edwards’ 
concept as a “process” perhaps related to “other- 
directedness” implies a motivational usage similar to 
that of the present research. 
of culturally approved statements to himself
and the denial of culturally unacceptable 
traits. Most importantly, however, to assess 
the strength of social desirability motivation 
in a test situation one must be able to deter-
mine the actual presence or absence of the
traits, characteristics, or symptoms that are
denied by the individual. Clearly, a need for 
approval would not necessarily be implied by 
the failure to attribute socially disapproved 
characteristics to oneself when these charac- 
teristics are not actually descriptive of the 
individual. In the development of the M-C 
scale, a psychometric model was employed 
which avoids the ambiguities arising from the 
failure to consider the actual incidence of 
traits represented in the test items. Items 
were selected for the M-C SDS from a de- 
fined universe representing behaviors which 
are culturally sanctioned and approved but 
which are improbable of occurrence. 

A low need for social approval implies a 
degree of independence of cultural definitions 
of acceptable behavior. The person less mo- 
tivated by a need for social approval might, 
in a testing situation, acknowledge certain 
symptoms, reject them as personally irrele- 
vant, or present other test responses depend- 
ing on such factors as the strength of his 
present needs, the kinds of responses re- 
quired, and the nature of the test stimuli. 

The present need construct clearly implies 
that “social desirability” has considerable 
generality beyond self-evaluative or test situa- 
tions, and this study was undertaken to as- 
sess the construct’s utility for predicting in- 
dividual differences in response to perceived 
cultural definitions of appropriate behavior. A 
situation was required that would be per- 
ceived by Ss as demanding of certain socially 
acceptable behavior. If Ss were presented with 
a boring, repetitive task and required to per- 
form it for a considerable period of time, it 
seems probable that frustration would ensue 
and that negative attitudes would be expressed 
towards the task. Were this boring task to be 
presented by an experimenter (E) who con-
spicuously played the role of university pro- 
fessor, authority figure, and omniscient psy- 
chologist in the presentation of the experi- 
ment and the elicitation of attitudes towards 
it, Ss with high social approval needs might 
be expected to express more favorable (so-
cially appropriate) attitudes than Ss less mo- 
tivated for approval. The spool packing task 
used by Festinger and Carlsmith (1959) 
seemed ideally suited for this purpose and, in 
slightly modified form, the attitude question- 
naire employed by them was deemed equally 
adequate. 

The definition of social desirability as a 
need for social approval and the belief that 
this can be attained by means of culturally 
acceptable behaviors would appear to overlap 
in some degree with the variable of conform- 
ity, and from the present definition of social 
desirability a relation with conformity would 
be predicted. The two concepts can be dif- 
ferentiated, however, in that the need for so- 
cial approval is a motivational variable, while 
conformity refers to a class of behaviors. Pre- 
diction of a relationship between social desir- 
ability and conformity assumes that conform- 
ity constitutes a category of behaviors avail- 
able to individuals seeking to gratify social 
approval needs. As regards this experiment, 
there is, nevertheless, a crucial question: 
would the two concepts differ in their utility 
for predicting the same behaviors? As a test 
of this the Independence of Judgment Scale 
(Barron, 1953), a paper-and-pencil measure 
of conformity, was included in the experiment 
to assess its value for predicting attitudes to-
wards the spool packing task.

Finally, since the present construct and its
derived test differ from other definitions and
measures of social desirability, the same re-
sults would not be expected from other so-
cial desirability scales. Accordingly, Edwards
(1957) SDS was incorporated in the pres-
ent design to determine its ability to predict
the favorability of attitudes towards spool
packing.

METHOD
Subjects


Fifty-seven undergraduate male students in intro-
ductory psychology classes participated on a volun-
tary basis in the experiment. The experiment was
conducted at the University of Kentucky (26 Ss)
and at Ohio State University where 31 Ss were ob-
tained.


Procedure 


The experimental procedure was identical at the 
two universities except for the use of a different E 
at each institution, and the administration of the Ed-
wards SDS to 29 Ss at Ohio State only. The Ss were 
individually administered the spool packing task, a 
four-item questionnaire intended to elicit attitudes 
towards the packing task, the M-C SDS, the Bar- 
ron Independence of Judgment Scale and the Ed- 
wards SDS. Throughout the entire procedure, E 
maintained a professional and somewhat aloof man- 
ner, avoiding any conversation with S other than 
that necessary to conduct the experiment. The fol-
lowing instructions were read to the S who was
seated at a table directly opposite E:


My name is Dr. ______. I’m a psy-
chologist and I’m conducting an experiment on 
measures of performance. Before we get started 
on the experiment, I would like you to fill out 
these questionnaires. Sign your name on all of 
them. 


The Ss then completed the following scales: 

1. The M-C SDS, which consists of 33 items with 
true or false response categories.2 An illustrative
item is: “I never hesitate to go out of my way to 
help someone in trouble.” 

2. Immediately following the M-C scale, S com-
pleted the Barron Independence of Judgment Scale,
a 22-item questionnaire previously shown to be valid
for discriminating male conformists from male non-
conformists in an “Asch-type” situation (Barron,
1953; Tuddenham, 1958). An illustrative item is:
“It is easy for me to take orders and do what I am
told.”

3. At Ohio State University, 29 Ss completed the
Edwards SDS after the Barron scale. The 39 items
comprising the Edwards scale were obtained from
the MMPI F, L, and K scales and from the Taylor
Manifest Anxiety Scale.

When S completed the last scale, he was told:


Now for the experiment. The materials are
this box and the 12 spools. I want you to take
these spools one at a time, and place them in the
box. When you are finished, empty the box and
refill it one spool at a time. Continue to fill and
empty the box until I tell you to stop. Use one
hand, and work at your own preferred speed.


S then packed and unpacked the box for 25 minutes
while E held a stopwatch, conspicuously
pretending to be busily engaged in making notes on
S’s performance. After 25 minutes, E said,


[...] in the experiment itself.
I hope you enjoyed it. You [...] get a chance to see
how you react to the task and so forth. I would
like to know what your personal reactions are to
the task and the experiment. Would you answer
this questionnaire?

S then rated his reactions to the experiment by
answering the following four questions, taken from
Festinger and Carlsmith (1959):


2 A complete description of the M-C scale may be
found in Crowne and Marlowe (1960).
1. Was the task interesting and enjoyable?
Would you rate how you feel about the task on 
the scale below where —5 means extremely dull 
and boring, +5 means the task was extremely in- 
teresting and enjoyable, and 0 means the task was 
neutral, neither interesting nor uninteresting. 

2. Did the experiment give you an opportunity 
to learn about your abilities and skills? Rate how 
you feel about this on a scale from 0 to 10 where
0 means you learned nothing and 10 means you
learned a great deal. 

3. From what you know about the experiment, 
and the task involved in it, would you say the ex- 
periment was measuring anything important? That 
is, do you think the results may have scientific 
value? Rate your opinion on this matter on a 
scale from 0 to 10 where 0 means the results have
no scientific value or importance and 10 means 
they have a great deal of value and importance. 

4. Would you have any desire to participate in 
another similar experiment? Rate your desire to 
participate in a similar experiment again on a scale 
from —5 to +5, where —5 means you would defi- 
nitely dislike to participate again, +5 means you 
would definitely like to participate again, and 0 
means you have no particular feeling about it one 
way or the other. 


The three scales, the spool packing task, and the 
spool packing questionnaire were presented in two 
orders for the purpose of controlling the possible in- 
fluence of a sequence effect. Half of the Ss packed 
the spools first, answered the four questions, and 
then completed the various scales, while the other 
half completed the three scales first and were then 
administered the task and the four questions. The 
instructions to S were modified in accord with the 
order of presentation used. Ss in the two conditions 
did not differ significantly with respect to means or 
variances on any of the measures, and the data were 
therefore analyzed without regard for the order in 
which the tasks were presented. 
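
The counterbalancing described above can be sketched in code. This is only an illustration: the report does not state how Ss were allotted to the two orders, so the random assignment, the function name assign_orders, and the seed are assumptions introduced here.

```python
import random


def assign_orders(subject_ids, seed=0):
    """Split subjects into two presentation orders of near-equal size.

    Assumption: assignment is random; the original report only says that
    half of the Ss received each order.
    """
    rng = random.Random(seed)
    ids = list(subject_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {
        "task_first": sorted(ids[:half]),     # spools, questions, then scales
        "scales_first": sorted(ids[half:]),   # scales, then spools and questions
    }


# Hypothetical subject numbers 1..57 (the combined N reported in the study).
orders = assign_orders(range(1, 58))
```

With an odd N the two orders cannot be exactly equal, so one order receives the extra S.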


RESULTS 


As an initial step, the Ohio State and Ken- 
tucky Ss were compared with respect to 
means and variances on all the measures. No 
significant differences were obtained and the 
final analysis of the data was therefore based 
on the combined N of 57. It should be noted 
that significant results similar to those to be
reported below were obtained when statistical 
analyses were carried out separately for the 
Kentucky and Ohio State samples. Thus, the 
findings that follow represent, in essence, the 
pooled results of a replicated experiment. 

In the major hypothesis of the study, it 
was predicted that individuals with a high 
need for social approval would express more 
TABLE 1


DIFFERENCES BETWEEN HIGH AND LOW M-C
SD GROUPS IN EXPRESSED ATTITUDES


                                      High        Low
                                      (N = 30)    (N = 27)
Question                              Mean        Mean

How enjoyable tasks were
  (rated from −5 to +5)
How much they learned
  (rated from 0 to 10)
Scientific importance
  (rated from 0 to 10)
Participate in similar experiment
  (rated from −5 to +5)               3.63        1.67


** p < .01; one-tailed test.


favorable attitudes towards the spool packing 
task than Ss whose needs for social approval 
are relatively weaker. To test this hypothesis, 
Ss’ scores on the M-C SDS were dichotomized 
at the mean (14.93) to yield a high SD group 
of 30 Ss, and a low SD group of 27 Ss. 
Scores of the high group ranged from 15-29, 
while those of the low group were from 5-14. 
The differences between the mean ratings 
given to the four attitude questions by the 
two groups were tested for significance by 
means of t. The results of this analysis are
contained in Table 1. 
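
The analysis just described (a split at the scale mean followed by t tests on the attitude ratings) might be sketched as follows. The scores and ratings below are hypothetical, not the study's data, and the pooled-variance form of t is an assumption, since the report says only that differences were tested "by means of t." The names split_at_mean and pooled_t are introduced here for illustration.

```python
from math import sqrt
from statistics import mean, variance


def split_at_mean(scores):
    """Return (high, low) index lists; scores above the mean count as high."""
    m = mean(scores)
    high = [i for i, s in enumerate(scores) if s > m]
    low = [i for i, s in enumerate(scores) if s <= m]
    return high, low


def pooled_t(a, b):
    """Student's t for two independent groups, pooled variance (assumed form)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))


# Hypothetical scale scores and enjoyment ratings (-5 to +5) for eight Ss.
sds_scores = [22, 9, 17, 12, 25, 7, 16, 11]
enjoy = [3, -1, 2, 0, 4, -2, 1, -1]

high, low = split_at_mean(sds_scores)
t = pooled_t([enjoy[i] for i in high], [enjoy[i] for i in low])
df = len(high) + len(low) - 2
```

Because the reported M-C mean (14.93) is not a whole number, no S can fall exactly at the cutting point; the rule for ties used here is therefore immaterial to the reported split.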

Inspection of Table 1 indicates that the 
two groups differed significantly in mean rat- 
ings on each of the four questions. These dif- 
ferences are all in the predicted direction with 
the high SD group expressing significantly 
more favorable attitudes towards the experi- 
mental task than the low group. These find- 
ings support the general hypothesis that indi- 
viduals with a strong need for social ap- 
proval are significantly more likely to express 
attitudes congruent with perceived situational 
demands than individuals with a lesser need 
for social approval. 

To assess the relationship between scores 
on the Barron Independence of Judgment 
Scale and attitudes towards the spool packing 
task, an analysis similar to that carried out 
for M-C SDS was performed. Scores on the 
Barron scale were dichotomized at the mean 
(10.39) to yield a high conformity group 
(N = 31), and a low conformity group (N = 
26). Scores in the low group ranged from 2 
to 10, while scores in the high group ranged 
from 11 to 20. The mean ratings given to the
four questions by the two groups were then 
compared. 

The findings reported in Table 2 indicate 
only one significant difference between the 
mean ratings given by the two groups. On 
Question 2, the ratings given by the high con- 
formity group as to how much they learned 
about their abilities and skills were signifi- 
cantly higher (t= 2.02, p < .05) than the 
ratings assigned by the low conformity group. 
This single significant difference out of a pos- 
sible four, indicates that the conformity vari- 
able has limited utility for differentiating in- 
dividuals in the favorability of expressed atti-
tudes towards the spool packing task. 

The Edwards SDS, purported to be a meas-
ure of test-taking defensiveness—i.e., a meas-
ure of a non-test-relevant response determi- 
nant—was included in the study as a contrast 
to the present motivationally defined con- 
struct. Scores on the Edwards scale were di- 
chotomized at the mean (32.34), and a high 
group containing 14 Ss (range of 33-39) and 
a low group containing 15 Ss (range of 24- 
31) were obtained. 

The significance of the differences between 
the mean ratings given by the high and low 
Edwards SD groups to the four questions 
was also measured by t tests. Table 3 pre-
sents these data, and indicates that no sig- 
nificant differences were obtained, with the 
four t’s clustering around a value of 0. Quite
clearly, social desirability as measured by the 
Edwards scale is unrelated to attitudes to- 
wards the experimental task. 


TABLE 2


DIFFERENCES BETWEEN HIGH AND LOW CON-
FORMITY GROUPS IN EXPRESSED ATTITUDES


                                   High         Low
                                   (N = 31)     (N = 26)
Question                           Mean         Mean         D            t

How enjoyable tasks were
  (rated from −5 to +5)            1.31         [illegible]  [illegible]  [illegible]
How much they learned
  (rated from 0 to 10)             5.27         [illegible]  [illegible]  2.02*
Scientific importance
  (rated from 0 to 10)             6.58         6.55         [illegible]  [illegible]
Participate in similar experiment
  (rated from −5 to +5)            2.29         [illegible]  [illegible]  1.13


* p < .05; one-tailed test.


Social Desirability and Situational Demands 


As a final step in the analysis of the data, 
the intercorrelations between the M-C, Ed- 
wards, and Barron scales were computed to 
determine the extent to which scores on these 
scales are related to each other. Table 4 con- 
tains the results of this analysis. 

Inspection of Table 4 reveals that M-C SD 
scores are significantly correlated with both 
Edwards SD scores (r = .56, N = 29) and
with conformity scores (r = −.54, N = 57).
Scores on the Edwards scale are uncorrelated
with scores on the Barron scale (r = −.12,
N = 29). We may conclude that individuals
with a high need for social approval (M-C 
SD) tend to deny the symptoms and com- 
plaints represented in the Edwards SD items, 
and that a high need for social approval is 
also characteristic of individuals who give re- 
sponses on the Barron scale indicative of a 
relative lack of independence of judgment. 
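
As a rough check on the correlations reported above, a product-moment r can be converted to a t statistic with N − 2 degrees of freedom by the standard formula t = r√[(N − 2)/(1 − r²)]. The sketch below applies that conversion to the three reported coefficients; the report does not state which significance test was used, so this choice, and the function names pearson_r and t_for_r, are assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic (df = n - 2) for testing r against zero."""
    return r * sqrt((n - 2) / (1 - r * r))


# Coefficients and Ns as reported in the text and Table 4.
reported = {
    "M-C SD vs. Edwards SD": (0.56, 29),
    "M-C SD vs. conformity": (-0.54, 57),
    "Edwards SD vs. conformity": (-0.12, 29),
}
t_values = {name: t_for_r(r, n) for name, (r, n) in reported.items()}
```

The resulting values can be compared against a t table at the appropriate degrees of freedom; the two larger coefficients yield t values well beyond 3 in absolute size, while the third falls below 1.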


Discussion 


The major purpose of this study was to 
assess the utility of treating the construct of 
social desirability as a motivational variable 
applicable over a range of situations, in con- 
trast to the usual approach of employing 
measures of social desirability solely to ac- 
count for non-test-relevant response variance 
on personality questionnaires. That social de-
sirability scales designed to measure a spe-
cific test-taking attitude can account for a
portion of the variance in responses to per-
sonality tests has been amply demonstrated
(Edwards, 1957; Fordyce, 1956; Wiggins,
1959). There has been a general failure, how-


TABLE 3


DIFFERENCES BETWEEN HIGH AND LOW
EDWARDS SD GROUPS IN EXPRESSED
ATTITUDES


                                   High         Low
                                   (N = 14)     (N = 15)
Question                           Mean         Mean         D            t

How enjoyable tasks were
  (rated from −5 to +5)            .36          [illegible]  [illegible]  .06
How much they learned
  (rated from 0 to 10)             3.79         [illegible]  −.14         .12
Scientific importance
  (rated from 0 to 10)             5.87         [illegible]  [illegible]  [illegible]
Participate in similar experiment
  (rated from −5 to +5)            2.07         2.07         0            0

TABLE 4


CORRELATIONS BETWEEN M-C SD, EDWARDS
SD, AND CONFORMITY SCALES


                 Edwards SD          M-C SD

Conformity       −.12 (N = 29)       −.54** (N = 57)
M-C SD           .56** (N = 29)


** p < .01.


ever, to consider the possibility that the dis- 
position to dissimulate in a test situation may 
be an expression of a generalized need to seek 
social approval. The findings of this study 
provide clear support for a theoretical ra- 
tionale which views social desirability in mo- 
tivational terms, regarding it as a need for 
social approval accompanied by a belief or 
expectancy that this need can be satisfied by 
engaging in culturally and situationally sanc- 
tioned behaviors. 

As predicted, the attitudes of the high M-C 
SD group were significantly and uniformly
more favorable toward the experiment than 
those of the low M-C SD group. We would 
suggest that the Es, as a consequence of their 
prestige and mildly authoritative manner (re- 
flected in their title, occupation, and behav- 
ioral aloofness), were perceived by the high 
M-C SD Ss as persons whose favor was worth 
courting. Consequently, these high M-C SD 
individuals were strongly motivated to yield 
to the demands of the situation: i.e., to tell 
the E that his experiment was interesting, 
important, personally informative, and worth 
returning to. In contrast, individuals less 
strongly motivated for social approval were 
better able to resist stating what seemed so- 
cially appropriate and to offer instead more 
realistic appraisals of the experiment. Pre- 
sumably, the less favorable opinions of the 
low M-C SD Ss reflect, in part, the greater 
freedom of this group from social pressures 
in the formulation and expression of their 
opinions. The significant correlation of —.54 
obtained between M-C SD and conformity 
would seem to support this formulation. 
Scores on the Barron scale, however, did not 
serve to discriminate the favorability of ex- 
pressed attitudes towards the boring task as 
well as M-C SD scores. Although one might 
find certain similarities at a definitional level,
we would conclude that the need for social 
approval and conformity (as measured by 
the Barron scale) are not by any means 
identical concepts. In terms of the present 
experimental evidence, conformity is perhaps 
best conceptualized as defining a class or 
mode of behaviors in which individuals with 
a strong need for social approval may engage 
in a particular situation. 

The Edwards scale had no utility whatso- 
ever for predicting differences in attitudes to- 
wards the experiment. In a situation where 
self-evaluation is not a relevant factor, the 
Edwards scale appears to be of little value in 
the understanding of motivational determi- 
nants of behavior. This is hardly surprising 
when one recalls that the items included in 
the Edwards scale refer almost exclusively to 
the presence or absence of symptoms and 
complaints, with a consequent restriction of 
the behaviors that are represented in the item 
content. By way of contrast, items for the 
M-C SDS were selected with the intent that 
they be referents of a construct explicitly de- 
fined in motivational terms. 

Moreover, scores on the Edwards scale 
were uncorrelated with conformity scores (r =
−.12), a finding which suggests, when added
to other data recently reported (Wiggins, 
1959; Wiggins & Rumrill, 1959), that the Ed- 
wards scale may not be a “pure” measure of 
test-taking attitudes. To date, very high cor- 
relations have been reported between the Ed- 
wards scale and various MMPI scales and 
between the Edwards scale and the Taylor 
Manifest Anxiety Scale (Crowne & Marlowe, 
1960; Edwards, 1957; Wiggins, 1959). In 
contrast to these findings, considerably smaller 
correlations have been reported between the 
Edwards scale and tests less related to per- 
sonal adjustment (Crowne & Marlowe, 1960). 
It seems reasonable to suggest that the Ed- 
wards scale measures the extent to which an 
individual is willing to admit to symptoms in- 
dicative of maladjustment. Thus, we may ex-
pect substantial relationships between the Ed- 
wards scale and other measures when there is 
a corresponding overlap in item content (par- 
ticularly that related to psychopathology). 

The present study may be viewed as an at- 
tempt to delineate elements in the nomologi- 
cal net surrounding a defined construct of so-
cial desirability. It seems quite apparent that 
the “meanings” which may be attached to the 
Edwards scale as a measure of social desir- 
ability are limited in scope and differ in ma- 
jor respects from the demonstrated and im- 
plied meanings of the present conception. The 
findings with respect to the need for social 
approval strongly support the hypothetical 
properties ascribed to it. As Cronbach and 
Meehl (1955) have noted, successful predic- 
tions with diverse criteria support the claim 
of construct validity more forcibly than do
predictions involving very similar behaviors. 


SUMMARY 


An attempt was made to assess the utility 
of defining the construct of social desirability, 
in motivational terms, as a need for social ap- 
proval. A new social desirability scale previ- 
ously developed to measure this variable was 
administered to subjects at two universities. 
For comparative purposes, the Edwards So- 
cial Desirability Scale and the Barron Inde- 
pendence of Judgment Scale were also in- 
cluded in the study. 

Subjects performed a boring task for 25 
minutes, and then rated their attitudes to- 
wards the experiment. The major hypothesis 
of the study predicted that individuals with 
a strong need for social approval would ex- 
press significantly more favorable attitudes 
towards the experiment than individuals with 
a relatively weak need for social approval. 
The significant findings reported confirmed 
this prediction. Scores on the Edwards and 
Barron scales were not significantly related 
to the favorability of the subject's attitudes. 

The overall results were interpreted as con- 
tributing to the delineation of the properties 
which may be attached to two current defini- 
tions of the social desirability variable. 


REFERENCES 
Allison, J., & Hunt, D. E. Social desirability and
the expression of aggression under varying condi-
tions of frustration. J. consult. Psychol., 1959, 23,
528-532.

Barron, F. Some personality correlates of independ-
ence of judgment. J. Pers., 1953, 21, 287-297.

Cowen, E. L., & Tongas, P. N. The social desir-
ability of trait descriptive terms: Applications to
a self-concept inventory. J. consult. Psychol., 1959,
23, 361-365.

Cronbach, L. J., & Meehl, P. E. Construct validity
in psychological tests. Psychol. Bull., 1955, 52,
281-302.

Crowne, D. P., & Marlowe, D. A new scale of so-
cial desirability independent of psychopathology.
J. consult. Psychol., 1960, 24, 349-354.

Edwards, A. L. The relationship between judged de-
sirability of a trait and the probability that the
trait will be endorsed. J. appl. Psychol., 1953, 37,
90-93.

Edwards, A. L. The social desirability variable in
personality assessment and research. New York:
Dryden, 1957.

Festinger, L., & Carlsmith, J. M. Cognitive conse-
quences of forced compliance. J. abnorm. soc.
Psychol., 1959, 58, 203-210.

Fordyce, W. E. Social desirability in the MMPI. J.
consult. Psychol., 1956, 20, 171-175.

Hanley, C. Deriving a measure of test-taking de-
fensiveness. J. consult. Psychol., 1957, 21, 391-397.

Jackson, D. N., & Messick, S. Content and style in
personality assessment. Psychol. Bull., 1958, 55,
243-252.

Rosen, E. Self-appraisal, personal desirability, and
perceived social desirability of personality traits.
J. abnorm. soc. Psychol., 1956, 52, 151-158.

Tuddenham, R. D. Studies in conformity and yield-
ing: VII. Some correlates of yielding to a dis-
torted group norm. ONR tech. Rep., 1958, No. 8.
(Contract NR 170-159)

Wiggins, J. S. Interrelationships among MMPI
measures of dissimulation under standard and so-
cial desirability instructions. J. consult. Psychol.,
1959, 23, 419-427.

Wiggins, J. S., & Rumrill, C. Social desirability in
the MMPI and Welsh’s factor scales A and R. J.
consult. Psychol., 1959, 23, 100-106.


(Received February 11, 1960) 


Journal of Consulting Psychology 
1961, Vol. 25, No. 2, 116-122 


ATTITUDES TOWARD SEX ROLES AND FEELINGS 
Many writers have seen homosexuality as
either a concomitant or a cause of maladjust- 
ment and neurosis, although a few theorists 
have felt that homosexuals may be either 
adjusted or maladjusted. Evelyn Hooker's 
(1957) empirical results suggest that there is 
some justification for thinking that homo- 
sexual males may vary in their degree of ad- 
justment. A major assumption underlying this 
research is that homosexuals, as well as het- 
erosexuals, may, indeed, be differentially ad- 
justed, and this study attempts to investigate 
factors that may be associated with varying 
degrees of adjustment in the homosexual male. 

Factors which may be important are sug- 
gested by Bennett and by Hooker. Bennett 
(1947) emphasizes the fact that the hetero- 
sexual majority in a disapproving society 
tends to increase the homosexual’s sense of 
isolation through its selective treatment of 
him. The heterosexual is able to choose freely 
his social isolation; the homosexual is not. 
The only emotional surcease he may find is 
in the company of others like himself. In an- 
other article, Hooker (1956) agrees with 
Bennett that the homosexual male’s adjust- 
ment may be greatly facilitated by associa- 
tion with others like himself, and by adopt- 
ing the standards of the homosexual group. 
Following this line of thinking, one might be 
led to expect that the person who has many 
homosexual contacts or associations would be 
the most satisfied with his status. The op- 
posite would be expected of the homosexual 
male who is forced to maintain heterosexual 


1 This paper is based upon a master’s thesis sub- 
mitted to the University of Colorado and was par- 
tially supported by a grant from the Graduate 
School of that institution. The data for it were col- 
lected while the author was a United States Public 
Health Fellow under Training Grant M-6613. 


contacts and associations. This reasoning is 
subsumed here under a “role conflict” con- 
cept; role conflict can be said to exist when 
the individual is required by the social de- 
mands of a particular situation to behave in 
a manner incongruent with his normal, self- 
accepting role—assuming, of course, that he 
does accept the role of homosexuality for him- 
self. Conflicts of this nature might be ex- 
pected to exist, for example, in an individual 
who is employed in a job where he must dis- 
play the characteristics of a typical hetero- 
sexual male. 

Besides this sort of frustration and thwart- 
ing that the homosexual male may have to 
face in a very real, objective sense, he may 
also face conflicts which arise in his percep- 
tions of his role. In order to continue build- 
ing a set of logically consistent hypotheses, 
this study made the assumption that the per- 
son feels most comfortable and assured if he 
identifies with the role of the typical homo- 
sexual male, whatever he may perceive this 
role to be. A homosexual can therefore face 
a subjective, cognitive role conflict if he per- 
ceives a greater discrepancy between himself 
and the typical homosexual male than be- 
tween himself and the typical heterosexual 
male. This individual is one who feels that 
the qualities and attributes he possesses are 
closer to those characteristic of the typical 
heterosexual male than those characteristic of 
the typical homosexual male. 

The terms “adjustment” and “maladjust- 
ment” can have a variety of meanings. Per- 
haps more satisfactory operations for these 
terms could be provided by other measures, 
but these were precluded by the scope of the 
present study. Instead, subjective measures,
relating to the person’s feelings about him- 
self—his status, his sense of well-being, ade- 
quacy, etc—were used. In order to avoid con-
fusion in terminology, future reference will be 
made to “feelings of adequacy” instead of to 
“adjustment.” 

In this research, it is expected that feelings 
of adequacy will be greater in those who: 

1. Have homosexual contacts and associa- 
tions (“Contacts and associations” here in- 
clude leisure-time activity, “homosexual mar- 
riage,” and membership in homosexual groups 
and organizations.) 

2. Suffer fewer pressures toward hetero- 
sexual behavior and attitudes (“Pressures to- 
ward heterosexual behavior and attitudes” are 
presumed to be found in masculine or con-
flictful types of employment. A further indi-
cation of pressures is also inferred from an 
individual’s willingness to reveal his homo- 
sexual status to friends, relatives, and work 
associates, and nonpreferred contact with het- 
erosexuals.) 

3. Identify with the typical homosexual
male

4. Perceive fewer desirable characteristics
in the role of the typical heterosexual male

5. Perceive a smaller discrepancy between
themselves and the typical homosexual male
than between themselves and the typical het-
erosexual male

It should be noted here that the conviction 
with which these hypotheses were made was 
tempered by the fact that little research in 
this area has been done, and that Hooker and 
Bennett’s suggestions, upon which only Hy-
pothesis 1 is based, do not rest on a large
body of experimental evidence. Failure to
verify the hypotheses should not be
necessarily construed as a failure or short-
coming of the admittedly naive ideas and
theories on which they are based.


PROCEDURE 


[...] subjects (Ss) was used in
this study. The larger portion was contacted through
[...] of the Denver and the [...]
chapters of the Mattachine Society, a national
organization concerned with problems of [...]
ment.2 Anonymity of the Ss was preserved [...]
less, numbered questionnaires. In order to [...]
retesting of some Ss at a later date, a contact indi-


2 The author is grateful to the members and friends
of this society for their cooperation, and to William
A. Scott and Evelyn Hooker for their criticism.


vidual in each of these cities maintained a list of Ss
and their corresponding questionnaire numbers. Due 
to the nature of the sampling problems involved in 
this type of research, the representativeness of the 
sample is probably as good as can be achieved un- 
der the present circumstances. The contacts were in- 
structed to sample as many diverse elements of their 
respective homosexual populations as possible in or- 
der to further the aim of representativeness. 

For the purpose of this study, the criterion of 
homosexuality was defined by the Ss themselves. 
They were deemed “homosexual” if they were will- 
ing to label themselves as such to the contact person. 
Some individuals volunteered the information that 
they were predominantly bisexual, or preferred to 
think of themselves in this manner. However, as far 
as the investigator was able to ascertain, no S claimed 
to enjoy sexual relations with the opposite sex more 
than he enjoyed sexual relations with the same sex. 

Ss ranged in age from 21 to 63 years, with a mean 
of 34 years. They had been aware of their own 
homosexuality for 2 to 45 years, with a mean
of 17 years.
Eleven considered themselves “homosexually mar- 
ried,” but only one was heterosexually married. A 
great diversity of occupations was represented, and 
the residences of the Ss also varied. 

The data for this study were collected by means 
of a paper-and-pencil questionnaire. Fifteen Ss were 
group tested in Denver at a special meeting of the 
Mattachine Society, which publicized the session as 
widely as possible. At this time other blank
questionnaires were handed out to interested individuals
who thought they could contact friends, acquaint- 
ances, etc. who would be willing to complete the 
test and forward it directly to the writer. Stamped, 
self-addressed envelopes were provided for this pur- 
pose so that the completed questionnaire need not 
pass through the hands of a third party. 

Another 12 questionnaires were individually ad- 
ministered in San Francisco through the aid of the 
contact who is employed as a minister-counselor in 
the Mattachine office of that city. He endeavored to 
give the test to the first 12 homosexuals who came 
to his office. The remaining 20 questionnaires were 
returned from various sources, presumably filled out 
by friends or acquaintances of the original group 
of 15. 

Measurement of Feelings of Adequacy. Feelings of 
adequacy were measured by the following two de- 
vices: 

1. The self-ideal discrepancy. S was asked to rate 
each of 46 traits according to how well it described 
“himself as he is now,” on a seven-point scale with 
1 indicating “exactly like.” This set of ratings de- 
scribed the self-concept, and the same procedure was 
followed in ascertaining the ideal-self-concept, ex- 
cept that S was asked to rate each of the same words 
according to how he “would like to be.” It was as- 
sumed that the larger the sum of the absolute dis- 
crepancies in ratings of these traits under the two 
different sets, the greater the feelings of inadequacy 
in the S. Examples of these traits and the rationale
for selecting the particular set used are described in
more detail in the section on “Measurement of
Subjective Role Conflict.”

TABLE 1
COEFFICIENTS OF STABILITY
(N = 19)

Measure                              r

Self-Het. Discrepancy
Ideal-Het. Discrepancy
Ideal-Hom. Discrepancy
Self-Hom. Discrepancy
Ideal-Self Discrepancy
Direct Measure of Self-Adequacy

2. The direct measure. Twenty statements of the 
MMPI type, referring to S’s feelings of adequacy, 
were prepared and tested for homogeneity on a pilot 
sample of 45 General Psychology students. S was 
asked to rate each of the statements on a seven-point 
scale according to how much of the time he thought 
it applied to himself. Examples of these statements
are: “I am entirely self-confident,” “I certainly feel
useless,” and “I feel that I am a stable person.”
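The self-ideal discrepancy described in item 1 can be sketched as a sum of absolute rating differences. This is only an illustration of the scoring rule as stated in the text; the function name and the four example ratings are hypothetical (the study used 46 traits).

```python
def self_ideal_discrepancy(self_ratings, ideal_ratings):
    """Sum of absolute differences between self and ideal-self ratings.

    Both arguments are sequences of ratings on the seven-point scale
    (1 = "exactly like"). A larger sum is read as greater inadequacy.
    """
    if len(self_ratings) != len(ideal_ratings):
        raise ValueError("both rating sets must cover the same traits")
    for rating in list(self_ratings) + list(ideal_ratings):
        if not 1 <= rating <= 7:
            raise ValueError("ratings must lie on the seven-point scale")
    return sum(abs(s - i) for s, i in zip(self_ratings, ideal_ratings))


# Hypothetical ratings for four traits (assumed values, not study data).
example_self = [2, 5, 3, 7]
example_ideal = [1, 2, 3, 4]
example_score = self_ideal_discrepancy(example_self, example_ideal)
```

The same function would serve for the other discrepancy measures (for example, self versus "typical heterosexual male") by swapping in the corresponding rating set.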

Measurement of Objective Role Conflict. Objective 
role conflict was said to exist for a homosexual male 
employed in a job where he must display charac- 
teristics of a typical heterosexual male. Thus this 
study assessed Ss’ occupations, and these were rated 
by the investigator and another judge according to 
whether or not they seemed likely to pose conflict 
for the individual. “Conflict” was defined by two 
criteria: requiring the individual to behave in a man- 
ner characteristic of a typical heterosexual male, and 
assumption of this role in a type of employment 
which also required frequent contact with a pre- 
dominantly heterosexual public. A coding of 1 was 
made to indicate absence of conflict, 2 indicated that 
an ambiguous or neutral occupation was held, and 
3 indicated the likelihood of sex role conflict. A self- 
employed bookshop operator, a student, and an artist 
were, for example, rated 1; and a railroad engineman,
an engineering geologist, and a lawyer were
rated 3.

Measurement of Subjective Role Conflict. These 
measures were developed from lists of the same traits 
as were used to determine the self-ideal discrepancy. 
In other parts of the questionnaire S was asked to 
rate each of the 46 traits according to how well it 
described the “typical male homosexual,” the “typi- 
cal female heterosexual,” and the “typical male het- 
erosexual.” The same seven-point scale was utilized 
as before. The rationale for selecting the specific traits 
used rests upon articles like that of Parsons (1956), 
wherein some suggestions are made concerning
differentiating qualities assigned to men and women in
this culture, and upon intuitive hunches regarding 
traits which may be thought of as characteristic of 
men, women, and male homosexuals. Some of the 
descriptive adjectives were fillers, but the majority
were selected to represent diverse areas of human be- 
havior, regarded in popular stereotypes as charac- 
teristic of these three groups. Some illustrative ex- 
amples are: “able to get along with everybody,” “ag- 
gressive,” “ambitious,” “creative,” “flighty,” “fault 
finding,” “intellectual,” “irresponsible,” “mature,” 
“self-centered,” “sociable,” and “talented.” 
Reliability of the Measures. Nineteen of the origi- 
nal Ss were recontacted and administered a shortened 
form of the questionnaire approximately 3.5 to 4
months after the original test. Coefficients of sta- 
bility were then calculated for the discrepancies be- 
tween the self-concept and the typical heterosexual 
male, the ideal-self and the typical heterosexual male, 
the ideal-self and the typical homosexual male, the 
self and the typical homosexual male, and the ideal- 
self and the self. A coefficient of stability was also 
calculated for the direct measure of adequacy. These 
product-moment correlations are shown in Table 1. 
Split-half reliability coefficients were calculated on 
the total original sample for the two measures of 
adequacy. These, corrected for length by the
Spearman-Brown formula, are .96 for the self-ideal
discrepancy and .91 for the direct measure.
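The split-half procedure with the Spearman-Brown correction can be sketched as follows. The text does not say how the items were divided, so the odd-even split used here is an assumption, and the item data in the example are hypothetical.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list of item ratings per subject.

    Assumes an odd-even split of the items (not stated in the source).
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd_totals, even_totals))


# Hypothetical ratings: four subjects, four items each.
example_items = [[1, 2, 1, 2], [3, 3, 4, 4], [5, 6, 5, 6], [7, 7, 6, 7]]
example_reliability = split_half_reliability(example_items)
```

Read in reverse, the correction implies that a corrected coefficient R corresponds to a half-test correlation of R / (2 - R).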
Indications of the Validity of the Measures. An 
indication of the validity of the two measures of
self-adequacy was arrived at by correlating the scores
on the direct measure with the self-ideal discrepancy.
The obtained product-moment r of .53 was
judged sufficient to allow the use of both as
representing feelings of adequacy in this study.
Some support for the validity of certain of the
other discrepancy measures is also provided by data
from the original sample of homosexuals. Mean raw
score discrepancies were calculated for the following:
the self-typical homosexual male (self-hom.), the
self-typical heterosexual female (self-fem.), the
self-typical heterosexual male (self-het.), the typical
homosexual male-typical heterosexual female
(hom.-fem.), and the typical homosexual male-typical
heterosexual male (hom.-het.). These differences
between mean discrepancies were then tested for
significance (Table 2). It is readily apparent that the
average self is seen as more like the typical
homosexual male than like the typical heterosexual
female, and more like the typical heterosexual female
than like the typical heterosexual male. It is also
apparent that the typical homosexual male is perceived
as more like the typical heterosexual female than like
the typical heterosexual male. These results correspond
with common assumptions concerning the relative
similarities of homosexuals and typical males
and typical females; hence, they lend a certain
degree of confidence to the discrepancy measures on
which they are based.

TABLE 2

MEAN RAW SCORE DISCREPANCIES FOR THE SELF LESS THE
TYPICAL HOMOSEXUAL MALE, HETEROSEXUAL FEMALE, AND
HETEROSEXUAL MALE; AND MEAN RAW SCORE DISCREPANCIES
FOR THE TYPICAL HOMOSEXUAL MALE LESS THE TYPICAL
HETEROSEXUAL FEMALE AND HETEROSEXUAL MALE

Discrepancy     Mean     SD       t (a)

Self-Hom.       61.6    21.7
                                 -2.79*
Self-Fem.       68.3    23.9
                                  3.06*
Self-Het.       75.8    27.8
Hom.-Fem.       57.6    25.6
                                 -6.26**
Hom.-Het.       73.2    28.2

Note. Total N = 47. [remainder of note illegible]
(a) t of correlated differences.
* p < .0[illegible], two-tailed test.
** p < .001, two-tailed test.


RESULTS 


The major portion of the data analysis was
carried out in the following manner. The raw
self-adequacy scores on both measures and
the various raw discrepancy scores were
ordered in a frequency distribution and
inspected to find equal intervals which would
cover all the distributions of the total number
of discrepancies to be used in the analysis.
It was decided to use six intervals, in order
to facilitate punching the data into a
single IBM card column, and yet allow a
sizeable number of Ss in each group. Unless
otherwise noted, all analyses to be reported
in this section are based on these group scores
rather than on the raw scores. It should also
be noted (a) that the smaller the group score,
e.g., 1 as opposed to 6, the smaller the absolute
discrepancy, and (b) that in the case of
the self-adequacy measures, the smaller the
group scores, the greater the feelings of
adequacy.
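The conversion of raw scores into six equal-interval group scores can be sketched as below. The interval boundaries actually used are not given in the text, so the `low` and `high` bounds here are hypothetical parameters chosen only for illustration.

```python
def group_score(raw, low, high, n_groups=6):
    """Map a raw score onto equal-width intervals numbered 1..n_groups.

    low and high are the assumed bounds covering every distribution;
    the source does not report the bounds it used.
    """
    if not low <= raw <= high:
        raise ValueError("raw score outside the assumed range")
    width = (high - low) / n_groups
    index = int((raw - low) // width) + 1
    # A raw score exactly at the upper bound belongs to the top group.
    return min(index, n_groups)


# Hypothetical bounds of 0 to 120 give intervals 20 points wide.
example_groups = [group_score(x, 0, 120) for x in (0, 19, 65, 120)]
```

A single digit from 1 to 6 fits in one punched-card column, which is the practical constraint the passage mentions.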


Results Relating to Objective Role Conflict 


Hypothesis 1 stated that the more homosexual
contacts and associations the S had, the
greater would be his feelings of adequacy.
To test this, the following operations were
performed. Mean feelings of adequacy on both
measures were calculated for:

1. Ss associating predominantly with other
homosexuals and those associating
predominantly with heterosexuals

2. Ss belonging to homosexual social groups
or organizations, and those belonging to
neither

3. Ss who considered themselves homosexually
married and those who did not

No significant differences were found in the
first two operations; however, the differences
between the mean self-adequacy scores reached
significance on one measure in the third
operation (Table 3). Apparently homosexual
males can be said to feel more adequate if
they are homosexually married.

TABLE 3

MEAN FEELINGS OF ADEQUACY FOR SUBJECTS WHO WERE
EITHER “HOMOSEXUALLY MARRIED” OR
“HOMOSEXUALLY UNMARRIED”

                              N    Mean    SD      t

Direct Measure
  Homosexually Married       11    2.82   1.47
  Homosexually Unmarried     36    3.36   1.02   1.149
Self-Ideal Discrepancy
  Homosexually Married       11    2.18   1.33
  Homosexually Unmarried     35    3.51   1.27   2.956*

* p < .01, two-tailed test.
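The comparison of married and unmarried groups can be sketched from the summary figures in Table 3, reading the entries as means of 2.82 and 3.36 (direct measure) and 2.18 and 3.51 (self-ideal discrepancy), with the decimal points restored. The source does not say which form of the t test was used; the unequal-variance form below is an assumption, chosen because it roughly reproduces the printed values from the rounded means and standard deviations.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t for two independent groups from means, SDs, and group sizes.

    Uses the unequal-variance standard error (an assumption; the
    article does not name the formula it applied).
    """
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean2 - mean1) / standard_error


# Table 3 figures as read above: married group first, unmarried second.
t_direct = t_from_summary(2.82, 1.47, 11, 3.36, 1.02, 36)
t_self_ideal = t_from_summary(2.18, 1.33, 11, 3.51, 1.27, 35)
```

Because smaller group scores mean greater adequacy, a positive t here indicates that the married group reported feeling more adequate.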

To test Hypothesis 2 that feelings of ade- 
quacy will be greater in Ss who suffer fewer 
pressures toward heterosexual behavior and 
attitudes, the following operations were per- 
formed. Mean feelings of adequacy on both 
measures were calculated for: 

1. Ss who were rated as being in noncon- 
flictful jobs, ambiguous jobs, and conflictful 
jobs 

2. Ss expressing much satisfaction (rating 
of 1) with nonconflictful jobs, ambiguous 
jobs, conflictful jobs, and for Ss expressing less
satisfaction (ratings of 2, 3, and 4) with the 
three job categories 

3. Ss expressing preference for contact with 
homosexuals and actual leisure-time associa- 
tion predominantly with heterosexuals, pref- 
erence for contact with homosexuals and 
actual leisure-time association predominantly 
with homosexuals, preference for contact with 
heterosexuals and actual leisure-time association
predominantly with heterosexuals, and
then takes on two meanings: “deviancy” in
the sense that all Ss have departed from the 
cultural norm of preferring a heterosexual fe- 
male for a sex partner; and “deviancy” in the 
sense that some Ss have also chosen to reject 
the prescribed sex role—that of the typical 
heterosexual male. 

Since a comparison of mean self-adequacy 
scores on the direct measure for this homo- 
sexual male sample vs. a random sample of 
male students at the University of Colorado³
had yielded no statistically significant differ- 
ence, we are led to conclude that the former 
meaning does not necessarily lead to feelings 
of inadequacy, while the latter meaning does, 
since this research has finally found that those 
homosexual males who see the prescribed sex 
role as uncongenial are those who are also in- 
adequate. But those homosexual males who 
do adhere to the cultural standards of feel- 
ing, perceiving, emulating, and idealizing the 
typical heterosexual male are more likely to 
feel self-satisfied and adequate. 

Of course, all of these tentative interpreta- 
tions must be dealt with cautiously. The sam- 
ple used in this study was in no way ran- 
domly selected, and the generalizability of the 
results to the total homosexual male popula- 
tion is, therefore, questionable. 


SUMMARY 


This research was a study of feelings of 
adequacy in homosexual males. Designed to 
test a set of simple hypotheses, the study 
focused on objective and subjective role con- 
flict as possible factors relevant to homo- 
sexual males’ feeling of adequacy. 

Forty-seven homosexual males were anony- 
mously assessed by means of paper-and-pencil 
questionnaires. Feelings of adequacy were 
measured by: the discrepancy between the 
self- and the ideal-self-concepts on a list of 
46 traits deemed relevant to the homosexual 
male’s status; and responses to a list of 20 


³ The University of Colorado cross section was
studied by Robert Kassebaum and Leon Rappaport,
under the direction of William A. Scott.

MMPI-type statements presumably reflecting
feelings of adequacy. The correlation between 
these two measures of adequacy was .53. In- 
formation relative to objective role conflict 
was obtained through answers to questions 
regarding the subject’s status from which one 
could infer pressures toward heterosexual be- 
havior and attitudes. Subjective role conflict 
was assessed by means of ratings assigned to 
the same 46 traits as were utilized in the self- 
ideal discrepancy, with separate ratings ob- 
tained for the “typical homosexual male,” the 
“typical heterosexual female,” and the “typi- 
cal heterosexual male.” 

The findings did not, in general, support 
the role conflict hypotheses. Instead, the results
could more readily be interpreted as indicating
that subjectively adequate homosexual males
were those who tended to identify with the
masculine norms of the dominant culture.
Feelings of adequacy were associated
with: job satisfaction, preference for leisure- 
time association with heterosexuals, idealiza- 
tion of the role of the typical heterosexual 
male, and identification with the typical het- 
erosexual male rather than with the typical 
homosexual male. 

Since this was not a random sample of 
homosexual males, it is recommended that 
caution be used in generalizing from these 
results. 
A SECOND VALIDATION OF A LONG-TERM
RORSCHACH PROGNOSTIC INDEX FOR 
SCHIZOPHRENIC PATIENTS* 


ZYGMUNT A. PIOTROWSKI AND BARRY BRICKLIN
Jefferson Medical College of Philadelphia 


In 1952 a group of Rorschach prognostic
signs were presented with which the follow-up
conditions of schizophrenic patients could be
predicted (Piotrowski & Lewis, 1952). An
essentially postdictive methodology was used,
i.e., the prognostic signs were stated after the
follow-up groupings had been formed. In
1958 these signs were applied to 30
schizophrenic patients, some of whom improved
over an interval of at least 3 years and some
of whom remained unimproved over the same
interval. On the basis of this application the
signs were revised (in order to increase
reliability), resulting in the 1958 prognostic
index. The 1958 prognostic index was then
validated on a group of 70 schizophrenics
(Piotrowski & Bricklin, 1958). The purpose of the
current investigation was to test the validity
of the index on a group of patients who
differed in many important respects from the 70
used in 1958. The prognostic index was again
revised slightly, and was revalidated on 103
additional schizophrenic patients (the 1959
group). This slightly revised version was then
reapplied to the 1958 sample so as to
facilitate comparisons between the two groups. It
should be kept in mind that the 1959
revision did not change the cutoff point, and
the relative distribution of the 1958 cases was
not changed in any way, i.e., the two groups
constitute independent validating evidence for
the prognostic index.
METHODOLOGY


The essential validating methodology in both the 
1958 and the 1959 samples was to compare predic- 
tions made on the basis of the Rorschach prognostic 
index against follow-up statements made on the basis 
of independent clinical judgments. 

In the selection of cases, a rule of inclusion was 
that there be copious follow-up data on each schizo- 
phrenic patient, consisting of psychiatric interviews, 
psychiatric social worker interviews, interviews of 
the patient's family members by psychiatric social 
workers, and staff conference reports. This informa- 
tion had to extend at least 3 years past the time at 
which a Rorschach test had been administered. It 
was to this Rorschach test that the prognostic index 
was applied. The actual year in which the Rorschach 
had been given was unimportant so long as there 
was follow-up data at least 3 years subsequent to 
this year. The other standard of inclusion was that 
each patient be independently diagnosed as schizo- 
phrenic. 

The first step in the procedure was to search 
through the psychological test files and locate all 
schizophrenic cases to whom Rorschach tests had 
been administered at least 3 years ago. The second 
step was to see if there was extensive follow-up data 
on all such cases which extended at least 3 years 
subsequent to the time of Rorschach testing. The 
third step was to insure that each case had main- 
tained the diagnosis of schizophrenia over the entire 
interval for which information was available. The 
same clinicians who made the follow-up designations 
(see below) made all decisions to include or exclude 
cases. All such decisions were made without knowl- 
edge of Rorschach results. Only one-fourth of the 
Rorschach records taken at least 3 years earlier could 
be used. The remaining three-fourths of the cases did 
not have adequate follow-up information. 

The methodology consisted of comparing predic- 
tions made on the basis of the Rorschach prognostic 
index against follow-up statements as designated by 
experienced clinicians working independently, using 
the clinical and life history data enumerated above 
(Lewis & Piotrowski). On the basis of this follow- 
up information each patient was designated as im- 
proved or unimproved over the x-year interval, 
where x was at least 3. The prognostic index was 
applied to the Rorschach tests each one of which 
had been administered at least 3 years prior to the
time to which the follow-up extended. All Rorschach 
tests were identified by number only; the rater 
(Bricklin) had no other information at his disposal. 
This design eliminated the possibility of contaminat- 
ing factors. The rater who applied the prognostic 
index had no data as to the follow-up conditions of 
the patients; the clinicians who made the follow-up 
designations had no Rorschach data. A prediction 
(improved, unimproved) based on the Rorschach 
prognostic index was made for each case. A score of
+2 or more would predict the patient to remain un- 
improved; +1 or less would indicate the patient to 
be improved. These predictions were then confronted 
with the independent follow-up designations (improved,
unimproved) and the two results were compared
by the chi square technique.
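The prediction rule and its comparison with the clinical designations can be sketched as follows. Only the cutoff of +2 comes from the text; the function names and the five example cases are hypothetical.

```python
from collections import Counter

CUTOFF = 2  # stated in the text: +2 or more predicts "unimproved"


def predict(index_score):
    """Turn a prognostic index score into a follow-up prediction."""
    return "unimproved" if index_score >= CUTOFF else "improved"


def cross_tabulate(index_scores, clinical_designations):
    """Count (prediction, designation) pairs for a 2 x 2 table."""
    pairs = zip((predict(s) for s in index_scores), clinical_designations)
    return Counter(pairs)


# Hypothetical cases: index scores and independent clinical designations.
example_scores = [5, 2, 1, -1, 3]
example_status = ["unimproved", "unimproved", "improved", "improved", "improved"]
example_table = cross_tabulate(example_scores, example_status)
```

Keeping the scoring and the clinical designation in separate inputs mirrors the blind design described above, in which neither rater saw the other's data.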

The follow-up length (at least 3 years) was used 
not only in order to make possible a meaningful 
statement as to follow-up condition but to insure 
the correctness of the initial diagnosis. Since our 
methodology demanded that there be extensive follow-up
data on each patient, the chance of our having
included a nonschizophrenic patient was, for all
intents and purposes, ruled out. 

The 1958 and the 1959 samples of schizophrenic 
patients were chosen so as to differ from each other 
in age, intelligence, sex composition, socioeconomic 
status, and duration of time elapsed between onset
of manifest psychosis and Rorschach examination 
time. The purpose was to validate the prognostic 
index on as wide a range of schizophrenic patients 
as possible. 


Follow-Up Designations 


The same clinical follow-up criteria of improve- 
ment and unimprovement were used in both the 
1958 and the 1959 groups. Every available source of 
information—psychiatric interviews and evaluations 
plus psychiatric social work reports, family reports 
on patients’ adjustment and behavior, and staff con- 
ference notes—was scrutinized. The patient had to 
improve in all three of the following areas to be 
designated as improved. If he failed to improve in 
all three areas or grew worse, he was classified as 
unimproved. The three areas are: 

1. Thought processes. The relevance, coherence, 
sense of reality, comprehensiveness, consistency, con- 
fidence, and valid self-criticism in making state- 
ments, were considered. 

2. Psychosocial relations and work. The capacity 
of the patient to form meaningful emotional rela- 
tions with others was considered. The employment 
record of each patient was analyzed from two stand- 
points: as an additional check on his capacity to live 
with others, and to yield information on his ability 
to do some kind of work reasonably effectively. On 
hospitalized patients the hospital work history was 
considered. The patient had to display an increased 
capacity for productive work and for meaningful 
and more constructive interhuman relations to be 

designated as improved in this area. 
TABLE 1

POPULATION CHARACTERISTICS OF 1958
GROUP AND 1959 GROUP

                        1958 Group      1959 Group
Variables                (N = 70)        (N = 103)
Being Compared          Mean    SD      Mean    SD       t

Age                      28     8.4      34     7.2     4.70*
IQ                      118    17.0      97    15.1     8.24*
Follow-up Interval      6.0     3.7     6.5     2.8     1.06

* p < .01.


3. Attitude towards self. This refers to the degree 
of anxiety and to the degree of self-acceptance. To
be designated as improved in this area, the patient 
had to be more comfortable with himself, and had 
to give evidence of feeling in a realistic way that his 
life had become less troublesome and less difficult. 

It is important that two points be kept in mind 
when follow-up designations are to be made. (a)
Some symptoms of schizophrenia generally persist 
even in improved cases. There is generally some de- 
gree of affective blunting. Some traces of delusional 
thinking generally persist, even in improved cases. 
(b) When dealing with one of those patients termed 
“episodic” by Bleuler (1950), it is important not to 
consider daily (or even hourly) fluctuation in condi- 
tion as essential change. This is especially true of 
manic or depressive mood phases. These mood phases 
are almost always transient. An inspection of the en- 
tire duration of the follow-up interval must be made 
until its trend is revealed. With episodic patients the 
frequency of lucid intervals, and the degree of de- 
fect shown in the lucid intervals, become the decid- 
ing criteria. 


1958 Sample 


It had been possible to isolate from our psycho- 
logical test files 70 cases which met the conditions of 
inclusion, i.e., patient diagnosed schizophrenic, Ror- 
schach test available, follow-up information extend- 
ing at least 3 years subsequent to Rorschach, diag- 
nosis of schizophrenia maintained over entire in- 
terval for which information was available. The 
condition most difficult to meet in this group as 
well as in the 1959 group was that of obtaining huge 
masses of follow-up information so that an accurate 
appraisal as to the trend of each patient’s illness 
could be made. 

As shown in Table 1, the average intellectual level 
was above average in the 1958 group, with a mean 
Wechsler-Bellevue IQ of 118, the standard deviation 
being 17. Nearly all of the patients were first
admissions, and the Rorschach tests to which the signs
were applied had been administered 2 months to
[illegible] years after the onset of manifest psychoses.
The duration of this particular interval was gathered from the
follow-up sources listed above and did not neces- 
sarily correspond to the duration of hospitalization. 
The mean age of the 1958 group was 28 years, the 
standard deviation being 8.4 years. All of the pa- 
tients in this group belonged to the middle and high 
middle socioeconomic classes. There were 29 men and 
41 women in this sample. The mean follow-up interval
(time elapsed between the point at which
the Rorschach was administered and the time to 
which the patient was followed) was 6.0 years, the 
standard deviation 3.7 years. 


1959 Sample 


All patients met the conditions of inclusion as out- 
lined previously. These patients were chosen so as to 
differ in the above mentioned respects from the 1958 
group. To satisfy these requirements the patients 
were selected from a Veterans Administration Regional
Office (53 patients) in Philadelphia and from
the Veterans Administration Hospital (50 patients) 
at Coatesville, Pennsylvania. The 103 veterans in 
this group had become manifestly psychotic from 2 
to 10 years prior to the time at which they were 
administered the Rorschach test. At the time of
examination their mean Wechsler-Bellevue IQ was 97,
the standard deviation 15.1, as is shown in Table 1.
The mean age was 34 years, the standard deviation
7.2. The 1959 group contained 97 men and 6 women.
The differences between the 1959 group and the 1958
group in IQ and age are statistically significant
(p < .01) as are the differences in interval between onset
of manifest psychosis and psychological examination,
and sex composition. Practically all members of the
1959 sample belonged to a socioeconomic class below
the middle class. The mean follow-up interval in
this group was 6.5 years, the standard deviation 2.8.
The durations of times over which the patients were
followed subsequent to their Rorschach tests did not
differ significantly in the two groups.


RESULTS 
Validation 


1958 Sample. There was one disagreement
between the two clinicians as to follow-up
designations. This difference was reconciled in open
conference until one decision was reached for
the case (for design purposes).

It had been hypothesized that high index
scores would be associated with failure to
improve, and low index scores with improvement.
On the basis of the previous investigations,
+2 and more was chosen as the cutoff
point that would distinguish the unimproved
patients from the improved patients. In the
1958 validation group (Piotrowski & Bricklin,
1958), 49 patients obtained scores of at least
+2 points; 45 of them were worse or did not
change during the follow-up period, i.e., 45
of these cases had been independently
designated as unimproved. Of the 21 patients with
scores of less than +2 points, 18 were
independently designated as improved. The
prognostic index correctly predicted 90% of the
70 patients’ follow-up conditions. As may be
noted in Table 2, the chi square value of
41.03 (df = 1) indicates that such results
could occur by chance less than 1 in 100
times.

TABLE 2

CHI SQUARE ANALYSIS OF PROGNOSTIC INDEX
SCORES OBTAINED IN 1958 GROUP

                     Follow-up Status
Index Scores      Improved   Unimproved   Total

2 and more            4          45         49
1 and less           18           3         21
Total                22          48         70

Note. df = 1, χ² = 41.03, p < .01.

1959 Sample. There were two disagreements 
between the two clinicians as to follow-up 
designations. As above, these differences were 
reconciled in open conference until one de- 
cision was reached for each case. Seventy- 
two schizophrenic patients attained scores on 
the prognostic index of at least +2 points; 
70 of these patients had been independently 
designated as unimproved. Thirty-one pa- 
tients attained index scores of less than +2 
points; 22 of these patients had been desig- 
nated as improved. The chi square value of 
56.66 (df = 1) indicates that such an asso- 
ciation between the index scores and follow-up 
conditions could have occurred by chance less 
than 1 in 100 times. Thus the follow-up con- 
ditions of 89% of the patients had been cor- 
rectly predicted by the prognostic index in 
the 1959 group (see Table 3). 
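The chi square values and percentages of correct prediction can be checked from the cell counts reported for the two groups. The article does not say whether a continuity correction was applied; the uncorrected Pearson formula used here is an assumption, and it approximately reproduces the printed statistics.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi square for a 2 x 2 table (df = 1).

    Rows: index score of +2 and more, then +1 and less.
    Columns: improved, then unimproved.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def proportion_correct(a, b, c, d):
    """Share of cases where the index agreed with the clinicians.

    Hits are high scorers who stayed unimproved (b) and low scorers
    who improved (c).
    """
    return (b + c) / (a + b + c + d)


# Cell counts as reported for the 1958 and 1959 groups.
chi_1958 = chi_square_2x2(4, 45, 18, 3)
chi_1959 = chi_square_2x2(2, 70, 22, 9)
hits_1958 = proportion_correct(4, 45, 18, 3)
hits_1959 = proportion_correct(2, 70, 22, 9)
```

With one degree of freedom, any chi square above 6.63 is significant at the .01 level, so both groups clear that threshold by a wide margin.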


Comparison of Results 


Minor changes have been made in the
presentation of the 1959 data (from the 1958
data) in the grouping of several signs and in
the weightings of two. Signs 3 and 6, and
Signs 5 and 7, were combined because of
the relatively low frequency of occurrence of
Signs 6 and 7. The weightings of Signs 2 and
combined 3 & 6 were lowered from 3 points
to 2 points. This revised index was applied
to both the 1958 and the 1959 validation
groups in order to make the results directly
comparable. Neither the original validity of
the single signs nor of the original prognostic
conclusions in the 1958 group was affected in
any way. The same cutoff point of +2 was
used with both groups. The weighting changes
did not affect the relative distribution of
cases in the 1958 sample.

TABLE 3

CHI SQUARE ANALYSIS OF PROGNOSTIC INDEX
SCORES OBTAINED IN 1959 GROUP

                     Follow-up Status
Index Scores      Improved   Unimproved   Total

2 and more            2          70         72
1 and less           22           9         31
Total                24          79        103

Note. df = 1, χ² = 56.66, p < .01.

As can be seen, the prognostic index suc- 
cessfully predicted the follow-up conditions of 
90% of the patients in the 1958 group, and 
89% of those in the 1959 veteran group. 

The variables in which the two validation 
groups differed—intelligence, age, etc.—did 
not affect the validity of the prognostic in- 
dex. There is a tendency among the veterans 
(the 1959 group) to show a somewhat lower 
incidence of improvement in the low score 
group. In the 1959 veteran group, of the 31 
persons obtaining a prognostic index score of 
1 or less, 71% actually were improved. In 
the 1958 group, of the 21 patients obtaining 
the same score, 86% improved. This finding, 
which may be related to differential treat- 
ment procedures, remains to be investigated. 

The 1959 veteran group was composed of 
ambulatory or milder VARO cases (N = 53), 
and more severe VAH cases (N = 50). The 
1958 group (N = 70) falls between these two
other groups in terms of severity of illness;
these patients were hospitalized but were
early and mild cases at the time. It is interesting
to note that the mean prognostic index
scores were 2.8 (SD = 3.3) in the VARO
cases; 3.9 (SD = 4.3) in the 1958 group;
and 5.8 (SD = 4.0) in the VAH group. The 
mean prognostic index scores reflect the in- 
creasing severity of defect in the three groups. 
The difference in the prognostic index scores 
of VARO and VAH cases is significant at 
p < .01 level; that between the 1958 group 
and the VAH cases at p < .02. The difference 
in scores between the 1958 group and the 
VARO cases falls at the p < .10 level (t
tests).
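
The three group comparisons can be approximated from the reported means, standard deviations, and group sizes. The text does not say which form of the t test was used, so the sketch below assumes Welch's unequal-variance statistic; the function and variable names are illustrative. The resulting t values fall in the same order as the reported significance levels.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic for two independent groups, from summary statistics."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# (mean, SD, N) of the prognostic index scores as reported in the text.
groups = {
    "VARO": (2.8, 3.3, 53),
    "1958": (3.9, 4.3, 70),
    "VAH": (5.8, 4.0, 50),
}

comparisons = [("VAH", "VARO"), ("VAH", "1958"), ("1958", "VARO")]
t_values = {}
for first, second in comparisons:
    t_values[(first, second)] = welch_t(*groups[first], *groups[second])
    print(f"{first} vs. {second}: t = {t_values[(first, second)]:.2f}")
```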

Two of the signs, 3 & 6 combined and 4, 
were more frequent in the 1958 group. Thus 
it is possible that these signs are related to 
intelligence. The appearance of these signs 
apparently requires on the part of the pa- 
tient a critical attitude toward thinking and 
some facility in verbalizing thoughts, with the 
concomitant condition that these thoughts and 
their evaluations be defective. However, the 
habit itself of thinking and speaking in terms 
of probabilities rather than certainties (“could 
be, I don’t know”; “might be anything that 
has a shape”; etc.) is correlated with intelli- 
gence and therefore shows up to a greater 
extent in the brighter 1958 group. Signs 11
(determinant scarcity) and 12 (content monotony)
were more frequent in the less bright
1959 veteran group. This would be expected.
As a rule, the variety of Rorschach components
decreases with decreasing intelligence.
By retaining signs which appear with differing
frequency in varying populations we are
able to apply the same set of signs to different
types of schizophrenics.


TABLE 4


THE PROGNOSTIC SIGNS, THEIR WEIGHTINGS,
AND CHI SQUARE VALIDITY
(N = 173)


Sign
Number   Name of Signᵃ                                          p

1        Human Movement Responses 0 or 1 and Sum Color
         Responses outweighs Sum M by at least 3 (4)           <.01
2        Response repetition (perseveration) (2)               <.05
3 & 6    Vagueness of perception and meaning or
         inappropriate conceptual connection (2)               <.02
4        Indeterminate form responses (2)                      <.02
5 & 7    Breakdown of interpretive attitude or blurring
         of difference between imagination and
         sensation (2)                                         <.01
8        Absurdly inconclusive explanations (2)                <.02
9        Absence of human content (2)                          <.01
10       F+% below 60 (2)                                      <.01
11       Determinant scarcity (1)                              <.01
12       Content monotony (1)                                  <.01
13       No Human Movement Responses (2)                       <.01
14       At least 5 Human Movement Responses (-2)              <.01


ᵃ The weightings are given in parentheses. For a more detailed
description of each sign, including examples of each, the
reader is referred to Piotrowski and Bricklin (1958).
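
The way the weighted signs combine into a single score with a cutoff of +2 can be sketched as follows. This is an illustrative reading of Table 4, not the authors' scoring manual: the dictionary keys, function names, and the two example records are hypothetical, and the sketch assumes that the index is the simple sum of the weights of the signs present, with scores of 2 and more predicting an unimproved status.

```python
# Weights from Table 4; the dictionary keys are labels chosen for this sketch.
SIGN_WEIGHTS = {
    "1": 4, "2": 2, "3&6": 2, "4": 2, "5&7": 2, "8": 2,
    "9": 2, "10": 2, "11": 1, "12": 1, "13": 2, "14": -2,
}
CUTOFF = 2  # scores of 2 and more predict "unimproved"

def prognostic_index(signs_present):
    """Sum the weights of the signs found in one Rorschach record."""
    return sum(SIGN_WEIGHTS[sign] for sign in signs_present)

def predict(signs_present):
    """Return the predicted follow-up status for a set of present signs."""
    score = prognostic_index(signs_present)
    return "unimproved" if score >= CUTOFF else "improved"

# Hypothetical records, invented for illustration only.
record_a = {"2", "11", "14"}     # 2 + 1 - 2 = 1
record_b = {"1", "9", "12"}      # 4 + 2 + 1 = 7
print(prognostic_index(record_a), predict(record_a))
print(prognostic_index(record_b), predict(record_b))
```

Sign 14 carries the only negative weight, so a record rich in human movement responses can pull an otherwise borderline score below the cutoff.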

The prognostic validity of each sign was
measured by the chi square technique. This
was done for the 1958 and the 1959 groups
separately as well as for the entire sample.
Using the entire sample of 173 cases, a four-cell
contingency table was formed: the improved
and unimproved patients formed one
dimension, those manifesting and those not
manifesting the sign the other dimension.
The results are given in Table 4. The following
signs differentiated between the improved
and unimproved cases at the p < .01
level: 1, 5 & 7, 9, 10, 11, 12, 13, and 14.
Three signs differentiated at the p < .02
level: 3 & 6, 4, and 8.² Sign 2 differentiated
at the p < .05 level.


Reliability 

The reliability of the prognostic index was
tested independently of the main study by
having five raters apply the index to each of
10 schizophrenic cases, and by having another
rater and one of us (Bricklin) independently
apply the index to 25 schizophrenic
cases.

Five raters were given seven unimproved
and three improved schizophrenic Rorschach
records. These records were chosen at random
from among improved and unimproved
cases in proportion to the rate at which improved
and unimproved schizophrenic cases
appear in the general schizophrenic population.
The raters, of course, had no knowledge
of this decision rule.

There was no disagreement in the case of
eight patients, whom all five raters placed in
the same group, improved or unimproved.
One rater disagreed with the rest of them by
placing one unimproved patient among the
improved; and one rater disagreed with his
fellows by placing an improved patient among
the unimproved. Out of a total of 50 predictions,
only 2 were incorrect as to prognostic
conclusion. On four patients, all five raters
had identical prognostic index scores. On four
other patients the greatest difference in scores
among all five raters was only two points. On
the remaining two patients, the highest and
lowest scores differed by four points; however,
this difference was critical in only one
case by leading to a prognostic conclusion opposite
to that of the majority of raters.

² Signs 3, 5, 6, and 7 were all individually valid at
at least the p < .05 level, as they are in combination.

In addition, one other rater (Carter Zelez- 
nik) applied the index to 18 unimproved and 
7 improved cases chosen in the same manner 
as above. These same cases were independ- 
ently scored by one of us. The prognostic in- 
dex scores differed by two points in four 
cases, and by one point in one case. In only 
one case, however, did the difference (of two 
points) influence the prognostic conclusion. 
The reliability can be considered satisfactory. 
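
The two reliability checks reduce to simple proportions. The sketch below only restates the counts given in the text; the function and variable names are illustrative, and the report itself gives no formal reliability coefficient.

```python
def percent_agreement(correct, total):
    """Percentage of predictions that agree with the reference outcome."""
    return 100 * correct / total

# First check: 5 raters x 10 cases = 50 predictions, 2 of them incorrect.
five_rater_total = 5 * 10
five_rater_pct = percent_agreement(five_rater_total - 2, five_rater_total)

# Second check: 25 cases scored by two raters; the prognostic conclusion
# differed in 1 case.
two_rater_pct = percent_agreement(25 - 1, 25)

print(f"five raters: {five_rater_pct:.0f}% of predictions correct")
print(f"two raters: {two_rater_pct:.0f}% of conclusions in agreement")
```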


DISCUSSION 


Long experience has shown, as Bleuler 
(1950) noted, that a schizophrenic frequently 
undergoes marked and unpredictable changes 
in personality during the first years after the 
onset of manifest psychosis. Such marked and 
unpredictable changes are rare 3 or more 
years after the onset of manifest psychosis. 
This factor makes long-term prognostic in- 
vestigations more realistic, at the present 
time, than short-term investigations. The 
rapid alternations of condition often so char- 
acteristic of the early years of schizophrenia 
render the problem of making accurate and 
meaningful follow-up statements most com- 
plex. There is a strong tendency for the 
eventual course of the illness to make itself 
known after the first 3 years. The frequency 
of essential changes in condition is exceed- 
ingly low after the first 3 years. Another 
factor which complicates short-term prog- 
nostic study is the differential effects of vari- 
ous treatment procedures. Psychotherapy and 
other therapeutic procedures are much more 
effective in the beginning of manifest psycho- 
sis than in later years when schizophrenics 
become much less responsive to environmen- 
tal influences, including therapy (Gottlieb & 
Huston, 1943). This decrease in personality 
variability with time favors long-term prognostic
research and greatly complicates short-term
prognostic studies. It may be mentioned
that, in this study, differential treatment pro- 
cedures did not seem to affect the validity of 
a long-term prognostic index prediction in 
any consistent manner. 

Other attempts to prognosticate the out- 
come of schizophrenia have taken many 
courses. Kantor (1953), among others, has 
differentiated so-called “process schizophren- 
ics” from “reactive schizophrenics,” the prog- 
nosis of the latter being more favorable than 
that of the former. The reactive cases are 
characterized by a normal prepsychotic per- 
sonality and an acute onset usually accom- 
panied by a “logical” precipitating factor. 
The reactive cases are also characterized by 
a clouded sensorium. The process types are 
best characterized as “thinking disorders” 
and generally have insidious onsets of dis- 
ease. In a system generally similar to this, 
Langfeldt (1937) has distinguished so-called 
“typical” (process) and “atypical” (reactive) 
cases. 

One may also find in the literature many 
studies which list clinical symptoms or syn- 
dromes of schizophrenia along with the per- 
centage of improved cases associated with 
each. 

Difficulty in using the above mentioned pro- 
cedures as prognostic tools has to do not with 
their essential validities, but with the diffi- 
culty of adapting them to the individual case. 
Many of these approaches depend on accurate
and reliable case histories which are
often impossible to obtain. It may also be 
noted that clinical signs and syndromes more 
often than not are highly variable in the in- 
dividual case. 


SUMMARY AND CONCLUSIONS 


In 1958 a Rorschach prognostic index for 
the prediction of a schizophrenic’s clinical 
condition (improved or unimproved) 3 or 
more years after testing was offered. The 
index had been validated on a group of 70 
followed-up patients (Piotrowski & Bricklin,
1958). The present report describes the second
validation of the index on a group of 103
schizophrenic patients, differing from the first 
patient group in many ways, including aver- 
age intelligence, age, severity of illness, and 
distributions of sexes. The results were virtu- 
ally the same. In 90% of the cases in the first 
group, and in 89% of the cases in the second 
group, the prognostic index successfully pre- 
dicted the outcome conditions of the schizo- 
phrenic patients as either improved or unim- 
proved. 

Since the implications of a long-range prog- 
nostic index which validly differentiates be- 
tween schizophrenics who will be improved or 
unimproved are obviously serious, it is ad- 
visable to submit the prognostic index to ad- 
ditional tests. It must be remembered that 
the index applies to schizophrenics, and not 
to cerebral organic cases or psychoneurotics. 
The validity of the diagnosis of schizophrenia 
is an essential factor determining the degree 
of validity of the index. 


REFERENCES 


BLEULER, E. Dementia praecox or the group of
schizophrenias. New York: International Univer.
Press, 1950.

GOTTLIEB, J. S., & HUSTON, P. E. Treatment of
schizophrenia: Follow-up therapy in cases of insulin
shock therapy and in control cases. Arch.
Neurol. Psychiat., 1943, 49, 266-271.

KANTOR, R. E., WALLNER, E., & WINDER, C. L.
Process and reactive schizophrenia. J. consult. Psychol.,
1953, 17, 157-162.

LANGFELDT, G. The prognosis in schizophrenia and
the factors influencing the course of the disease.
Acta psychiat., 1937, Suppl. 13. (See also Suppl.
80, 110, 113)

PIOTROWSKI, Z. A., & BRICKLIN, B. A long-term
prognostic criterion for schizophrenics based on
Rorschach data. Psychiat. Quart. Suppl., 1958, 32,
315-329.

PIOTROWSKI, Z. A., & LEWIS, N. D. C. An experimental
criterion for the prognostication of the
status of schizophrenics after a three-year interval
based on Rorschach data. In P. Hoch & J. Zubin
(Eds.), Relation of psychological tests to psychiatry.
New York: Grune & Stratton, 1952. Pp. 51-72.


(Received February 25, 1960) 


Journal of Consulting Psychology 
1961, Vol. 25, No. 2 129-136 




Following the reports by Price and Deabler 
(1955) and Garrett, Price, and Deabler 
(1957) of studies in which perception of the 
negative spiral aftereffect (SAE) was found 
to discriminate with great accuracy between 
brain damaged patients and two nonorganic 
control groups, the clinical implications of
this perceptual phenomenon have been studied
by many different investigators. Some, such
as Gilberstadt, Schein, and Rosen (1958) and
Philbrick (1959), have carefully followed the
methods of Price and Deabler (1955). Others,
such as Gallese (1956), Davids, Goldenberg,
and Laufer (1957), Spivak and Levine (1957),
and Page, Rakita, Kaplan, and Smith (1957),
have used somewhat different procedures and/or
scoring systems. While the variations employed
in these studies make it difficult
to compare their results precisely, at least one
generalization appears to be warranted. While
brain damaged subjects (Ss) as a group do
tend to report the negative SAE less frequently
than do either normal Ss or nonorganic
psychiatric patients, the differences between
groups are far less clear-cut than originally
reported by Price and Deabler.

In trying to explain this discrepancy, several
investigators have suggested that the
composition of the subject samples employed
in different studies has been a crucial factor.
In many cases the brain damaged Ss were 
older, more chronic patients than were the 
control groups studied. Location and type of 
brain damage are very probably other fac- 
tors of importance. Both Gallese (1956) and 
Page et al. (1957) noted that many pre- 
frontal lobotomy cases reported the SAE as 
readily as did normal Ss, and Aaronson 
(1958) has suggested that involvement of the 
temporal lobes is especially likely to eliminate 
the normal SAE response. Other investigators 
have raised the possibility that the apparent 
decrements shown by the brain injured may 
result chiefly from an inability to report the 
aftereffect rather than from failure to perceive 
it. Gallese (1956) attempted to deal with this 
problem by using more directive and prob- 
ing instructions than did Price and Deabler 
(1955), while at the same time he revised his 
scoring procedures to reduce the likelihood 
that reticence or difficulty in report would 
contribute to low scores. 

In the studies cited above, there have been 
only minor variations in the stimulus condi- 
tions under which the spiral was presented. 
Although the conditions utilized by Price and 
Deabler (1955) seem to be close to optimal 
for perception of the SAE by normal Ss, no 
attempt has been made to establish the opti- 
mal conditions for its perception by the brain 
injured or for differentiating between brain 
damaged and nonorganic Ss. The present 
study was designed to investigate the effects 
of varying concomitantly two easily controlled
and readily quantified stimulus conditions,
which, from studies of other aftereffect
and apparent motion phenomena (Hammer,
1949; Teuber & Bender, 1949), seemed
likely to be of significance in determining
aftereffect perception. These variables are the
speed at which the spiral is rotated and the
length of time for which S observes the rotating
spiral. In setting up this study an attempt
was made to control as many as possible
of the factors which have been suggested
above as possible explanations for the range
of results found by other investigators.


TABLE 1


DESCRIPTION OF PATIENTS STUDIED


Cortical Damage Group                     NP Control Group

Diagnosis                      Number     Diagnosis                    Number

Chronic brain syndrome,
  associated with trauma         20       Depressive reaction            17
Cerebral vascular accident       13       Schizophrenia                  10
Convulsive disorder               5       Anxiety reaction               10
Cerebral arteriosclerosis         5       Psychosomatic disorders         6
Cortico-striato-spinal disease    2       Character disorder              4
Alzheimer’s disease               2       Paranoid state                  1
General paresis                   1       Conversion reaction             1
Brain tumor                       1       Phobic reaction                 1
Chronic brain syndrome, alcohol   1

Total                            50       Total                          50
Average age                      42.5     Average age                    45.0
Years of formal schooling         9.4     Years of formal schooling      10.0


METHOD 


Subjects. Three groups of Ss were included in this 
study. 

1. The Cortical Damage Group consisted of 48 
male and 2 female patients considered by the physi- 
cians in charge of their respective cases to have 
demonstrable organic cortical damage. Many of them, 
of course, also had damage at subcortical levels. 
These patients were recommended for this study by 
their case doctors or other interested staff members, 
as not too severely aphasic or otherwise distressed 
to be able to cooperate in the experiment. All of the 
available information concerning the patient’s brain 
damage was gathered from his clinical chart and 
from a short conference with his case physician, 
usually a resident in neurology or neurosurgery. The 


diagnoses listed for these patients at the time of 
testing are given in Table 1. 

2. The Neuropsychiatric (NP) Control Group consisted
of 42 male and 8 female patients with no
known or suspected organic brain pathology, who
were referred for routine psychological testing. The
older patients in this group (60 years and older)
were specifically recommended for this study by
their physicians as showing no clinical signs of organic
brain damage. Diagnoses of this group are also
listed in Table 1. The patient groups were comparable
in age and educational level, as indicated in
Table 1. All patients in this study had been hospitalized
for 6 months or less, although for many of
them this was not a first admission.

3. The College Student Group consisted of 35 female
and 15 male undergraduates who were required
to take part in a number of experiments to fulfill
their Introductory Psychology course requirements.
This group was, of course, considerably younger and
better educated than the patient groups, and unhospitalized.

A total of 2 brain injured and 6 psychiatric patients
had to be eliminated from the study because
they were unable or unwilling to cooperate.

Apparatus. The spiral used here was a 920-degree
Archimedes spiral made from a tracing of the one
used by Price and Deabler (1955), altered to a diameter
of 7 inches. It was rotated by a device with
a variable speed control which permitted rotation at
speeds of from 18 to 90 rpm and a pulley arrange- 
ment which allowed rotation in either direction. This 
apparatus was located slightly below eye level for 
the average S at a viewing distance of approximately 
6 feet. 

Experimental design. The rotation speeds used were
selected on the basis of a preliminary study with
eight normal Ss as speeds which represented an approximately
optimal condition (90 rpm), a definitely
nonoptimal condition (18 rpm), and the midpoint
between these extremes (54 rpm). Two exposure
times were used. Ten seconds appeared to be
a near-minimum time for SAE perception, while the
30-second exposure was one which earlier investigators
had employed successfully, obtaining aftereffect
reports from nearly all of their normal Ss.

Ten trials were presented to each S in a fixed order
of decreasing difficulty as determined by the preliminary
study. The first eight trials represented all
combinations of 18 and 54 rpm, 10- and 30-second
exposure times, and clockwise (CW) and counterclockwise
(CCW) directions, providing a 2 × 2 × 2
factorial design applied to each S. The last two trials
were run under conditions similar to those employed
in most previous clinical studies, thus providing a
basis for comparison with the results of earlier investigators.
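
The trial schedule can be written out explicitly to confirm that the first eight trials fill every cell of the 2 × 2 × 2 design. The sketch below is illustrative: the variable names are assumptions, and the order is the one in which the conditions are listed in Table 2 (speed, direction, exposure time).

```python
from itertools import product

# Fixed presentation order, written as speed-direction-exposure, as in Table 2.
TRIAL_ORDER = [
    (18, "CCW", 10), (18, "CW", 10), (54, "CW", 10), (54, "CCW", 10),
    (18, "CCW", 30), (18, "CW", 30), (54, "CW", 30), (54, "CCW", 30),
    (90, "CCW", 30), (90, "CW", 30),
]

# The first eight trials should cover every cell of the 2 x 2 x 2 design.
factorial_cells = set(product((18, 54), ("CW", "CCW"), (10, 30)))
first_eight = TRIAL_ORDER[:8]
covers_design = set(first_eight) == factorial_cells and len(set(first_eight)) == 8

# The last two trials use the near-optimal 90 rpm speed with 30-second exposure.
last_two = TRIAL_ORDER[8:]

print("first eight trials cover the factorial design:", covers_design)
for number, (rpm, direction, seconds) in enumerate(TRIAL_ORDER, start=1):
    print(f"trial {number:2d}: {rpm} rpm, {direction}, {seconds} s")
```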


Procedure. Ss were asked if they could see the
spiral clearly, and were instructed to look directly
at it during all trials. They were told that the spiral
would be rotated and that they were to watch for
apparent expansion, contraction, or changes in depth
or distance. The exact wording varied somewhat depending
upon Ss’ educational level. Ss were asked to
describe what the spiral appeared to be doing while
it was rotating, and again when it was halted. Just
before stopping the rotation on each trial, the experimenter
reminded S that he was to keep looking
at the spiral even when the apparatus was turned
off. All responses were recorded as nearly verbatim
as possible, and in doubtful cases a brief inquiry was
conducted at the end of the 10 regular trials.


TABLE 2


PERCENTAGE OF SUBJECTS REPORTING AFTEREFFECT
UNDER EACH SET OF CONDITIONS


               Cortical       NP
Conditions     Damage       Control     Student

18-CCW-10         6%          60%         10%
18-CW-10         12           70           ?
54-CW-10         16           88           ?
54-CCW-10        16           72           ?
18-CCW-30        16           78           ?
18-CW-30         22           94           ?
54-CW-30          ?           92           ?
54-CCW-30        26            ?           ?
90-CCW-30        30            ?           ?
90-CW-30         52            ?           ?


Note. ? marks an entry that is illegible in this copy. For
90-CW-30 a further value of 100 is legible, but its column
cannot be determined.


Fig. 1. Percentage of subjects reporting apparent
size and/or depth changes. (Horizontal axis: Conditions 1-10;
panel label: After Rotation.)


RESULTS 


Of primary interest in this study were the 
effects of stimulus variation upon report of 
the spiral aftereffect by the different groups 
of Ss. Table 2 shows the percentage of Ss in 
each group reporting the SAE under each 
combination of conditions. The sets of condi- 
tions are listed in the fixed order in which 
they were presented, with rotation speed given 
first, followed by the direction of rotation, 
and then by the exposure time. Clockwise ro- 
tation normally produces an aftereffect of ex- 
pansion, while the CCW rotation usually pro- 
duces a contraction aftereffect. 

Although the SAE was of particular con- 
cern, the responses given both during rotation 
of the spiral and immediately after its cessa- 
tion were analyzed. Figure 1 presents the per- 
centage of Ss in each group who reported ap- 
parent size and/or depth changes during ro- 
tation and during the usual aftereffect period 
for each set of conditions. The sets of conditions
are listed in the order in which they
were presented. Obviously there is little difference
among these groups with regard to
perception of apparent size and depth changes
during rotation of the spiral. A medians chi
square test done on these data supports this
conclusion. There is, however, a clear difference
with regard to the SAE. While the Student
and NP Control Groups respond very
similarly, the Cortical Damage Group reports
the SAE much less frequently. The conditions
which apparently produce the greatest
differentiation of organic and nonorganic Ss
are those of medium difficulty. The conditions
most nearly approximating those of
previous studies discriminate much less well.


TABLE 3


SUMMARY OF ANALYSES OF VARIANCE OF PROPORTIONS
OF SUBJECTS REPORTING AFTEREFFECT


                                 Mean Square                  F
Source                   df   Nonorganic   Organic    Nonorganic   Organic

Exposure Times (A)        1     0.4857     0.2560      48.57**     12.80**
Rotation Speeds (B)       1     0.1567     0.1580      15.67**      7.90*
Rotation Direction (C)    1     0.4349     0.0623      43.40**      3.12
A × B                     1     0.0210     0.0070       2.10        0.35
A × C                     1     0.0042     0.0099       0.42        0.49
B × C                     1     0.0004     0.0001       0.04        0.01
A × B × C                 1     0.0101     0.0199       1.01        1.00
Error                           0.0100     0.0200
Total                     7

* p < .01
** p < .001

Because the response of the Cortical Dam- 
age Group was so clearly different from that 
of the other two groups, separate analyses of 
variance of proportions were done, using the
method of Walker and Lev (1953). These
analyses are summarized in Table 3. All main
effects were significant for the combined Student
and NP Control Groups, while for the
Cortical Damage Group, exposure time and
rotation speed had significant effects, but direction
of rotation did not. None of the interactions
between variables was statistically significant.


Fig. 2. Number of subjects receiving each SAE score.
(Vertical axis: Number of Subjects.)
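
Each F in Table 3 is the effect's mean square divided by the error mean square for that group of Ss. The sketch below checks this for the main effects; the dictionary layout and names are illustrative assumptions. The nonorganic direction term is left out because its printed mean square and F do not divide out exactly in this copy.

```python
# Mean squares from Table 3 for the main effects, with the error terms.
mean_squares = {
    "nonorganic": {"exposure": 0.4857, "speed": 0.1567, "error": 0.0100},
    "organic": {"exposure": 0.2560, "speed": 0.1580, "direction": 0.0623,
                "error": 0.0200},
}

def f_ratios(group):
    """Divide each effect's mean square by the group's error mean square."""
    error = group["error"]
    return {name: ms / error for name, ms in group.items() if name != "error"}

results = {label: f_ratios(group) for label, group in mean_squares.items()}
for label, ratios in results.items():
    for effect, f_value in ratios.items():
        print(f"{label:10s} {effect:9s} F = {f_value:.2f}")
```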

Since the clinical efficacy of this phenomenon
as a tool for use in the diagnosis of brain
damage has been a major point of contention
in the literature, each response of every S was
assigned a score of 1 if it indicated the normal
SAE, 0 if it did not, following the procedure
of Gallese (1956). Thus each S received
a score which was equal to the number
of normal SAE responses. Figure 2 shows the
number of Ss receiving each score. As this figure
shows, the Cortical Damage Group differs
markedly from the other groups in terms of
the number of SAE responses reported. Of the
50 Cortical Damage Ss, 44 reported the SAE
fewer than 6 times in 10 trials, while 46 of
the 50 NP Controls reported the SAE on 6 or
more of their 10 trials. A medians chi square
test on these data indicates that this difference
between groups is significant well beyond
the .001 level.

Information concerning the location of each
patient’s brain damage, insofar as it could be
ascertained, is presented in Table 4, along
with data on the SAE scores for patients with
damage in each area. Only 2 of the 50 patients
in this group came to autopsy during
the period of this study, and even in these
cases complete histological examination of the
brain was not carried out. Therefore, it must
be recognized that the location of damage
is not always clearly delineated or precisely
placed. These data, based on the best information
available, can only be offered as being
suggestive, not conclusive. Unfortunately, the
number of patients with damage apparently
limited to any one discrete area is too small
to permit drawing definite conclusions about
the relationship between the location of damage
and report of the SAE. There do, however,
appear to be differences associated with
the extent of damage, bilateral or generalized
damage being associated with lower scores
than unilateral or localized damage. A t test
with correction for heterogeneity of variance
indicated that the unilateral vs. bilateral difference
was statistically significant (p < .01).


TABLE 4


AFTEREFFECT “SCORES” FOR DIFFERENT
AREAS OF BRAIN DAMAGE


Location of Damage              Mean     SD     Range     N

?                               1.26    1.33     0-5     19
?                               2.00    2.08     0-6     11
?                               1.67     -       0-3      3
?                               1.00     -       0-2      2
Bilateral Frontal-Temporal      0.00     -        -       1
Left Frontal                   10.00     -        -       1
Left Frontal-Temporal           5.20    2.58     2-9      5
Left Temporal                   1.00     -        -       1
Left Occipital                  2.00     -       0-4      2
Right Frontal-Temporal           ?       -        -       1
Right Temporal                  9.00     ?        ?       3
Right Parietal                  4.67     -       2-10     ?
Bilateral Damage                1.18    1.23     0-5     22
Unilateral Damage               3.25    3.12     0-10    28


Note. ? marks an entry that is illegible in this copy.


DISCUSSION


As might have been anticipated from
work done on other perceptual phenomena,
the results of this study show that both shortening
the time of exposure to the rotating
spiral and decreasing its rotation speed make
perception of the SAE more difficult. This is
true for both brain damaged and nonorganic
Ss. From the analyses of variance presented
in Table 3, it appears that the variable which
contributes most to the probability of occurrence
of the SAE is the length of time for
which the rotating spiral is observed. The
amount of variance attributable to rotation 
speed is also statistically significant for all 
groups, but the actual sums of squares are 
much smaller than those for exposure time. It 
is interesting that direction of rotation should 
be a significant variable for the nonorganic 
Ss but not for the organic group. Apparently 
this is due to the fact that perception of the 
SAE was so difficult for the latter group un- 
der all but the most optimal conditions that 
the increment of difficulty added by CCW ro- 
tation made very little overall difference. 

The results of this study are clearly in 
agreement with Price and Deabler’s (1955) 
contention that report or nonreport of the 
SAE discriminates between brain damaged 
and nonorganic Ss with a fairly high degree 
of accuracy. Using the series of stimulus con- 
ditions developed for this experiment and a 
cutoff score at the median point for all pa- 
tients, i.e., between five and six reports of the 
SAE in 10 trials, 88% of the Cortical Dam- 
age and 92% of the NP Control patients are 
correctly identified. However, it must be 
pointed out that these Ss are not a random 
sample from a general hospital population, 
but rather specific patients for whom there 
was evidence of organicity or a lack of such 
evidence. 
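
The scoring and cutoff rule described above can be sketched as follows. The function names and the example S are hypothetical; only the cutoff (between five and six reports in 10 trials) and the group counts come from the text.

```python
CUTOFF = 6  # a score of 6 or more SAE reports in 10 trials counts as "nonorganic"

def sae_score(trial_reports):
    """Score 1 for each trial with a normal SAE report, 0 otherwise."""
    return sum(1 for reported in trial_reports if reported)

def classify(score):
    """Apply the cutoff that falls between five and six reports."""
    return "nonorganic" if score >= CUTOFF else "organic"

# Counts reported in the text: 44 of 50 Cortical Damage Ss scored below 6,
# and 46 of 50 NP Controls scored 6 or more.
cortical_correct, cortical_n = 44, 50
control_correct, control_n = 46, 50

cortical_pct = 100 * cortical_correct / cortical_n
control_pct = 100 * control_correct / control_n
overall_pct = 100 * (cortical_correct + control_correct) / (cortical_n + control_n)

# Hypothetical S, invented for illustration: SAE reported on 3 of 10 trials.
example_trials = [False, False, True, False, False, True, False, False, False, True]
example_score = sae_score(example_trials)

print(f"Cortical Damage correctly identified: {cortical_pct:.0f}%")
print(f"NP Control correctly identified: {control_pct:.0f}%")
print(f"both groups combined: {overall_pct:.0f}%")
print(f"example S: score {example_score}, classified as {classify(example_score)}")
```

Because the cutoff was set at the median of these same patients, the percentages describe fit to this sample rather than accuracy in a new one, which is the caution the authors raise about the selected nature of the Ss.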

The patients who are misclassified by this 
method of scoring can be clearly separated 
into two categories. Two of the Ss in each 
group received borderline scores, i.e., reported
the SAE sufficiently often to be just
above or just below the cutoff point. Four 
Cortical Damage Ss and two NP Controls 
are markedly atypical of their respective 
groups in report of the SAE. 

The two NP Control Ss who failed to re- 
port the SAE on any trial were both over 60 
years of age. While neither showed any ob- 
vious clinical signs of deterioration such as 
are usually associated with brain damage, one 
could argue that in the absence of histological 
controls, organic brain pathology in Ss of this 
age could very well have been undetected in 
the general clinical picture, while still affect- 
ing SAE report. It should be pointed out, 
however, that many elderly patients received 
maximal or near-maximal SAE scores in this
study.

The four Cortical Damage Ss who reported
the SAE 9 or 10 times in 10 trials show a 
definite similarity in their respective clinical 
pictures, although the diagnosed area and type 
of damage is different in each case. Each of 
these patients has had very mild damage, al- 
ways very localized (as far as is known), and 
each shows little or no neurological or intel- 
lectual impairment. This would again support 
the conclusion that size and severity of dam- 
age are highly important factors in determin- 
ing whether or not the SAE is reported. Ex- 
tensive and/or severe damage to any part of 
the cortex appears much more likely to de- 
stroy the SAE than is localized damage. This 
may very well be one of the principal reasons 
for the great variation in the results reported 
by different investigators who have studied 
the spiral as a clinical diagnostic device. It 
seems a likely explanation for the “normal” 
responses of many of the lobotomy cases and 
epileptic Ss studied by various researchers. 

There is still the possibility, however, that 
difficulty in verbal report, which might be ex- 
pected to correlate with the severity of or- 
ganic damage, is the crucial factor. Both 
Gallese (1956) and Aaronson (1958) have 
raised this possibility. Furthermore, Schein
(1960) and Van de Castle and Strong (1957)
noted that with many patients who did not
report the SAE spontaneously, a slight alteration
of the test stimulus would elicit such a
report. The latter findings could mean that
these patients were having difficulty in verbal- 
izing the SAE, much as Aaronson’s patients 
with anomia apparently did. It could also sug- 
gest, however, that this was a reflection of 
psychological rigidity, of the inability to shift 
one’s psychological set, which has so fre- 
quently been described as characteristic of 
the brain injured. Because of the explicit in- 
structions used in the present study and the 
inquiry conducted in doubtful cases, it is un- 
likely that the differences in SAE report seen 
here can be attributed primarily to verbal in- 
efficiency. Furthermore, in the analysis of re- 
sponses given during rotation of the spiral it 
was found that all except three of the Cortical 
Damage Ss reported the occurrence of appar- 
ent size and/or depth changes during rota- 
tion, indicating that it was not an inability 
to perceive, conceptualize, or verbalize such
phenomena which led to nonreport of the
SAE. 

Nevertheless, the results of the above stud- 
ies raise a very important question. Can it be 
said that many brain damaged patients do 
not perceive the SAE, i.e., do not consciously 
experience it unless a new test figure is sub- 
stituted for the original? If so, this implies 
that the apparent loss of the SAE in brain 
damaged patients is due to something more 
than just a destruction of sensory elements in 
the cortical visual system. It would appear 
that for spontaneous perception of the SAE 
not only must the neuronal chains from retina 
to visual cortex be reasonably intact, but also 
that other parts of the brain which are in- 
volved in organizing and transforming sen- 
sory information into conscious awareness 
must be functional. If this is correct, it 
should not be surprising that mild, focal dam- 
age almost anywhere in the cortex does not 
interfere with the SAE, while larger, more 
severe damage usually does. 

It appears to me that much of our current 
theorizing concerning the apparent destruc- 
tion of the SAE by brain damage has been 
much too narrow in scope. We have been at- 
tempting to seek out a single comprehensive 
factor which would explain both the failure 
of many brain damaged Ss to report the SAE 
and the contradictory evidence which has 
been reported in several studies. The factor 
of impaired verbal report is perhaps the most 
frequent “cause” invoked to explain this dis- 
crepancy. I believe that in doing so, we have 
been very much inclined to oversimplify what 
is in actuality an extremely complex process. 
Because of the striking character of the phe- 
nomenon, it is easy to forget that the percep- 
tion of the SAE is in itself a highly complex 
operation. Multitudinous factors are involved 
in determining whether or not a particular S
will perceive the SAE on any given trial. In 
the present study many Ss, even among the 
college students, sometimes failed to perceive 
the SAE under what are usually favorable 
conditions after having reported the phe-
nomenon under more difficult conditions. This
study has provided a systematic investigation
of several stimulus variables, with certain
subject variables controlled, but there are
many variables which were not, and perhaps
could not be, controlled in such a clinical ex-
periment. We know from the work of Wohlge-
muth (1911) and others that such variables
as the size and viewing distance of the spiral, 
the state of light or dark adaptation of the 
eye, the intertrial interval, the constancy of 
fixation, and the fatigue state of the S can all 
be important. Brain damage, then, is not the 
only factor which may prevent the occur- 
rence of the SAE, and likewise, brain damage 
alone may not prevent such occurrence under 
otherwise favorable conditions. 

Not only is the SAE a complex phenome-
non, but the brain itself is such an extremely
complex structure, with at least some degree
of localization of function, that it seems to
me completely unreasonable to expect that
brain damage as a general clinical entity
should affect SAE perception in the same
way in each patient. Damage of different de-
grees of severity to different areas of the brain
will produce different effects, any one or more 
of which may be crucial with regard to SAE 
report in a given case. Frontal lobe damage, 
especially with involvement of the prefrontal 
region, frequently interferes with the ability 
of Ss to attend or concentrate and greatly 
increases their distractibility (Peale, 1954).
Thus, in many cases the SAE may not be
perceived primarily because of the inability
to maintain fixation. Goldstein (1942) tells
us also that damage to the frontal lobes im-
pairs the patient’s ability to take and hold
the abstract attitude. His methods for the
measurement of such impairment would sug-
gest that this is closely related to an inability
to shift psychological set, and it may be that
such patients cannot consciously experience
the SAE unless something is done to modify
their original set, as Schein (1960) noted. In
still other cases, damage to the temporal lobe
producing an aphasic disturbance could re-
sult in an inability to report the SAE even
though it is consciously perceived. Both
[name illegible] and Bender (1949) and Werner
and [name illegible] (1942) found that unusually
short stimulus intervals were necessary to produce
apparent movement phenomena in their brain
damaged Ss, suggesting that some sort of
pathophysiological deficit was occurring in the
integration of discrete visual stimuli. In many
cases with diffuse and severe cortical damage
the cortical visual system and/or “associa-
tion” cortex may be so disrupted as to pre-
vent reception and integration of the stimuli
impinging upon the retina.

In the present study, the use of some stimu- 
lus conditions which make SAE perception 
difficult even for normal Ss added a further 
increment of difficulty to those noted above. 
Thus, in addition to the different physiologi- 
cal effects produced by shorter, more slowly 
moving stimuli, the unusually long series of 
trials increased the tendency for physical and 
psychological fatigue to interfere with S’s con- 
centration on the task at hand. These addi- 
tional handicaps for the brain damaged pa- 
tient, whose SAE report may already have 
been impaired by one or more of the factors 
suggested above, probably produced the rela- 
tively clear-cut differentiation of brain dam- 
aged and nonorganic groups in the present 
study. 


SUMMARY 


This study was designed to investigate the 
effects of certain stimulus variables upon per- 
ception of the spiral aftereffect (SAE). A 
920° Archimedes spiral was presented to 50 
patients with cortical brain damage, 50 psy- 
chiatric patients with no known or suspected 
brain damage, and 50 college students in a 
standard series of trials. These trials contained 
all combinations of 18 and 54 rpm rotation 
speeds, 10- and 30-second exposure times, and 
clockwise and counterclockwise rotations. One 
additional trial in each direction was given at 
90 rpm with a 30-second exposure time, ap- 
proximating the conditions used by other in- 
vestigators. 
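The trial series described above can be enumerated as a quick check on the count of trials. The following is a minimal sketch in Python; the variable names are illustrative, and the order of presentation used in the study is not stated here, so the ordering below is only an assumption.

```python
from itertools import product

speeds_rpm = (18, 54)
exposures_s = (10, 30)
directions = ("clockwise", "counterclockwise")

# All combinations of speed, exposure time, and direction: 2 x 2 x 2 trials.
factorial_trials = [
    {"rpm": rpm, "exposure_s": secs, "direction": direction}
    for rpm, secs, direction in product(speeds_rpm, exposures_s, directions)
]

# One additional trial in each direction at 90 rpm with a 30-second exposure.
extra_trials = [
    {"rpm": 90, "exposure_s": 30, "direction": direction}
    for direction in directions
]

all_trials = factorial_trials + extra_trials
print(len(factorial_trials), len(extra_trials), len(all_trials))
```

The total agrees with the 10 trials per subject referred to in the results that follow.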

It was found that during rotation of the 
spiral the patient groups reported apparent 
size and/or depth changes with approximately 
equal frequency. Following rotation, however, 
46 of the 50 NP Control Ss reported the SAE 
6 or more times in 10 trials, while only 6 
of the Cortical Damage Ss did likewise. The 
college students responded almost identically 
with the NP Controls. The atypical psychi- 
atric patients tended to be elderly people, 
raising the possibility of undiagnosed brain 
damage associated with aging, while the atypi- 
cal Cortical Damage Ss were characterized by 
mild and localized damage, with little or no
clinical neurological or intellectual impair-
ment. Although histological evidence was not
available, the data strongly suggest that lo-
cation of damage is relatively unimportant,
but that extensive or severe damage to any 
part of the cortex markedly reduces the prob- 
ability of SAE report. 
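The size of the group separation can be made concrete by treating "6 or more SAE reports in 10 trials" as a classification rule. This is only an illustrative sketch in Python: the summary reports the counts, but using them as a cutoff rule, and the variable names below, are assumptions made here.

```python
n_per_group = 50
controls_at_or_above = 46  # NP Control Ss reporting the SAE on 6+ of 10 trials
damaged_at_or_above = 6    # Cortical Damage Ss reporting the SAE on 6+ of 10 trials

# Under the illustrative rule, 6+ reports is read as "nonorganic".
correct_controls = controls_at_or_above
correct_damaged = n_per_group - damaged_at_or_above

overall_accuracy = (correct_controls + correct_damaged) / (2 * n_per_group)
print(correct_controls, correct_damaged, overall_accuracy)
```

Under this reading, 90 of the 100 patients fall on the expected side of the cutoff.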

Analyses of variance indicated that ex- 
posure time and rotation speed were signifi- 
cant factors in perception of the SAE for all 
subject groups, while direction of rotation was 
a significant variable only for the nonorganic 
groups. Conditions of medium difficulty dis- 
criminated best between the patient groups, 
while conditions similar to those of previous 
studies were less discriminating.

It was concluded that multiple factors are 
involved in the disruption of SAE report in 
brain damaged Ss, and that the increment of 
difficulty added by a long series of relatively 
difficult trials contributed to the clear-cut dis-
crimination between groups in this study.


INTELLECTUAL FUNCTIONING IN A GROUP OF
NORMAL OCTOGENARIANS*


This paper presents some empirical findings
on the intellectual functioning of 50 men who
were in service during the Spanish American
War and who now average 80 years of age. 
These men live in the Greater Boston Area 
and responded to an invitation to attend the 
research oriented Geriatric Clinic established 
at the Boston Veterans Administration Out- 
patient Clinic in 1958. The functions of the 
Clinic as well as social data on these men are 
described elsewhere by Nichols and Cummins 
(in press).

The questions asked by this study were 
the following: First of all, how would these 
80-year-old, relatively healthy men perform 
on the Wechsler-Bellevue Intelligence Scale 
(Wechsler, 1944) and how would their re-
sults compare with those reported in other
studies of older people? Secondly, is there
evidence of intellectual decline as measured
by this test, and if so, how does this evi-
dence compare with findings on other age
groups; and is this intellectual decline re-
lated to level of intelligence as has been some-
times claimed? Thirdly, do the time limits
established for the Wechsler-Bellevue scale,
Form I, inordinately penalize the slower,
older person?

The Wechsler-Bellevue Form I scale was 
employed rather than the newer Wechsler
Adult Intelligence Scale (Wechsler, 1953) be-
cause there are more data available on Form
I performance by various pathological
and normal age groups with which these data
might be compared.

* Based upon a paper presented at the twelfth An-
nual Meeting of the Gerontological Society, De-
troit, Michigan, November 1959, reporting
an investigation carried out at the Boston Veterans
Administration Outpatient Clinic.

Subjects. Table 1 presents some descriptive infor-
mation about the subjects. The Total Group of 50
men range in age from 73 to 89 years, with mean
age at 80.4, standard deviation 2.1 years. Their edu-
cation ranges from no formal education to college
graduate (16 years). Mean education for the Total 
Group is 7.9 years. Their highest achieved occupa- 
tional status as rated according to Warner’s scale 
(Warner, Meeker, & Eells, 1949) ranges from 1 (the
highest level, as professional) to 7 (the lowest level, 
as laborer) with a mean occupational level of 4.2. 

For the purpose of intragroup analysis the Total 
Group was divided into Younger and Older Groups. 
The 23 subjects in the Younger Group range in age 
from 73 to 79 years and their mean age is 77.6. The 
men in the Older Group range in age from 80 to 89 
and average 82.8 years, approximately 5 years older 
than the men in the Younger Group. As can be seen 
from Table 1, the two groups are highly similar in 
education and highest achieved occupational level. 
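The subgroup figures can be checked against the Total Group with a weighted mean. A minimal sketch in Python, using only the group sizes and mean ages given above; the variable names are illustrative.

```python
# (N, mean age) for each subgroup, as reported in the text and Table 1.
groups = {"Younger": (23, 77.6), "Older": (27, 82.8)}

total_n = sum(n for n, _ in groups.values())
pooled_mean_age = sum(n * mean for n, mean in groups.values()) / total_n
age_gap = groups["Older"][1] - groups["Younger"][1]

print(total_n, round(pooled_mean_age, 1), round(age_gap, 1))
```

The weighted mean reproduces the reported Total Group mean of 80.4, and the gap between subgroup means is 5.2 years, in line with "approximately 5 years."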

Procedure. Subjects were administered the 11 sub- 
tests of the Wechsler-Bellevue Intelligence Scale, 
Form I, by experienced examiners using standard in- 
structions with the exception of extending time limits 
on all timed tests except Digit Symbol. The method 
is similar to that employed by Doppelt and Wallace 
(1955) and was done in order to compare perform- 
ance under standard (ST) and extended time (ET) 
limits. 

Although the Vocabulary subtest was administered 
it was not considered in determining the Verbal 
weighted scores or IQs in order to make these re- 
sults more comparable with those of other investi- 
gators. The IQs were determined by the extrapola- 
tion technique suggested by Wechsler (1944) for age 
groups beyond those given in his tables. 


RESULTS 


Table 2 presents the Verbal, Performance, 
and Total weighted scores and IQs of the 
Younger, Older, and Total Groups. With re- 
spect to weighted scores, differences between 
groups are slight and nonsignificant; though 
curiously, the Older subjects outdo the 
Younger on the performance scale, where they

TABLE 1

AGE, EDUCATION, AND OCCUPATION OF SUBJECTS

                    Age in Years        Education in Years    Occupational Rating
                                                              (Warner’s Scale)
Group     N     Range    M     SD     Range    M     SD     Range    M     SD
Younger   23    73-79   77.6   1.6    4-14    8.0   2.2     1-6     4.3   1.3
Older     27    80-89   82.8   2.6    0-16    7.8   3.5     1-7     4.0   1.5
Total     50    73-89   80.4   2.1    0-16    7.9   2.9     1-7     4.2   1.4

TABLE 2

VERBAL, PERFORMANCE, AND FULL SCALE WEIGHTED
SCORES AND IQs FOR YOUNGER, OLDER, AND
TOTAL GROUPS

                    Younger Group    Older Group     Total Group
Scores                M      SD        M      SD       M      SD
Verbal WS            47.2    7.3      44.1   11.8     45.3   10.1
Performance WS       32.6    9.4      33.1    9.6     32.8    9.5
Full Scale WS        79.1   14.3      78.0   18.1     78.5   16.3
Verbal IQ           110.5    6.9     109.2   10.8    109.8    9.2
Performance IQ      110.5   10.0     113.7    9.9    112.1   10.1
Full Scale IQ       108.7    8.2     109.9   10.2    109.3    9.3


might be expected to do worse. This result is 
probably a sampling fluctuation. The superi- 
ority of verbal weighted score over perform- 
ance weighted score is statistically significant 
for both Older and Younger subjects.

With respect to IQ, there are no significant 
differences between groups. The mean IQ, 
using Wechsler’s extrapolation method, is 109. 

Table 3 presents the means and standard 
deviations of subtest scores for the subgroups
and Total Group. The highest subtest scores
for the Total Group are Vocabulary, Infor-
mation, and Comprehension, with average
weighted scores of 10 or 11. The next lower 
group of scores, averaging about 8 or 9, were 
those of Arithmetic, Similarities, Picture Com- 
pletion, and Object Assembly. The lowest 
grouping of scores were those on Digit Span, 
Block Design, Picture Arrangement, and Digit 
Symbol, averaging 5 or 6. 

It is worthy of note that the Information, 
Comprehension, and Vocabulary scores above 
10 are consistent with the above average IQs 
yielded for the test as a whole by Wechsler’s 
extrapolation method. 


One may examine the pattern of differential 
decline of the various subtests by two means: 
first by comparing the differential scatter of 
subtests which occurred within the two age 
groups, and then by comparing these subjects’ 
scatter of subtest scores with that reported 
by other investigators with aged subjects. 

Figure 1 demonstrates the similarity of sub- 
test scatter for the Younger and Older Groups 
within the sample. First, we note that the 
Younger men score higher than the Older 
men on 8 of the 11 subtests, although none 
of these differences is significant by t test.
Secondly, we note that the pattern of per- 
formance of the two groups on the various 
subtests is highly similar. This is evidence 
within this particular sample of the reliability 
of the differential decline of subtest perform- 
ance which has been reported for age by 
Wechsler (1953) and others. 


TABLE 3

MEANS AND STANDARD DEVIATIONS OF SUBTEST
WEIGHTED SCORES FOR YOUNGER, OLDER,
AND TOTAL GROUPS

                     Younger Group    Older Group      Total Group
Subtest                M      SD        M      SD        M      SD
Information           11.4    1.5     [illegible]      [illegible]
Comprehension         10.6    2.4     [illegible]      [illegible]  2.6
Digit Span             7.2    2.9      6.4    2.6       6.7    2.6
Arithmetic             9.2    2.5      8.6    2.9       8.9    2.7
Similarities           8.8    2.3      8.1    3.5       8.4    2.9
Vocabulary            11.8    2.5     11.0    3.6      11.3    1.9
Pic. Arrangement       6.2    3.1      5.0    2.5       5.5    2.8
Pic. Completion    [illegible] 3.4     9.4    3.2       8.7    3.2
Block Design           5.8    1.8      5.6    2.5       5.7    2.2
Obj. Assembly          7.7    3.0      8.6    2.4       8.1    2.7
Digit Symbol           4.9    1.7      5.0    1.7       4.9    1.7
FIG. 1. Mean weighted subtest scores for younger and
older groups.

Table 4 presents the subtest performance
of this aged group contrasted with that of
others reported in the literature, with subtests
ranked in terms of their average weighted
scores. Although these various samples are
drawn partly from institutions and partly
from the community, there seems to be
agreement at the ends of the scale. As indi-
cated by the averages of the ranks across
studies, the abilities contributing to perform-
ance in the Information, Comprehension, and
Arithmetic subtests are best retained in the 
older years, and those involved in perform- 
ance on the Block Design, Picture Arrange- 
ment, and Digit Symbol subtests are least 
well retained. The major factor involved in 
the last mentioned three is probably speed. 

The four subtests ranked in the middle 
range do not clearly distinguish themselves 
from each other in terms of differential de- 
cline. 

Wechsler developed a method of estimating 
deterioration of functioning by comparing the 
sum of scores of the subtests which have been 
found to decline more slowly with age with 
the sum of scores of those subtests on which 
performance tends to be significantly im- 
paired with age. This so-called deterioration 
was measured in the present group and com- 
pared with Wechsler’s norms for younger age 
groups. The result is shown in Figure 2.

Using Wechsler’s formula for deterioration
quotient (or DQ) shown in the lower left-
hand corner of Figure 2, the average DQ for
the Younger Group was found to be 70.1 and
for the Older Group to be 66.5. These are
plotted along with the DQs for various age
groups reported by Wechsler. They seem to
fall directly on an extrapolation of the em-
pirical curve determined by Wechsler. Thus,
the same rate of decline of abilities relative to
each other is occurring in this sample as in
Wechsler’s sample despite the fact that the
present sample is of higher average intelli-
gence. The DQ, as an intratest ratio for each
individual, is relatively independent of the
level of subtest scores.


TABLE 4

SUBTESTS RANKED BY AVERAGE SCORE IN PRESENT STUDY AND OTHERS

Subtest              Rank in Present Study (average age 80)
Information           1
Comprehension         2
Arithmetic            3
Pic. Completion       4
Similarities          5
Object Assembly       6
Digit Span            7
Block Design          8
Pic. Arrangement      9
Digit Symbol         10

[Rank columns for the comparison studies (Rabin, 1945; Fox & Birren, 1950;
Madonick & Solomon, 1947; Chesrow, 1949; Howell, 1955, two samples) and the
Average of Ranks column are illegible in the source.]


FIG. 2. Average deterioration quotients at different
ages: Wechsler’s norms and present results.


The relation of deterioration to present
level of intelligence was tested further by
correlating DQ with IQ. The correlation was
.03. DQ was then correlated with two other
measures which can be considered to be re-
lated to past functional intelligence, namely,
education and highest occupational level
reached. The correlations were -.01 and .01.
Thus no evidence was found of relationship
between present or past intellectual level and
amount of present intellectual decline.

Now let us consider the effects of using
standard time limits or extended time limits
with this older group.


TABLE 5

MEAN WEIGHTED SCORES AND CHANGES IN SCORES ON
FOUR SUBTESTS USING STANDARD AND
EXTENDED TIME LIMITS

                               Arithmetic    Picture        Block        Object
                                             Arrangement    Design       Assembly
                               ST    ET      ST    ET       ST    ET     ST    ET
Mean                           8.9   9.1     5.5   5.8      5.7   6.6    8.1   8.6
SD                             2.7   2.7     2.8   2.8      2.2   2.5    2.7   2.3
Percentage of Group Changed      12%           16%            43%          18%
Average Change                   1.5           1.6            2.1          2.8
rho                              .96           .96            .89          .89


Table 5 presents the changes in weighted
scores on four timed subtests which occurred
by allowing subjects extra time. Twelve per-
cent of the subjects increased their Arithmetic
score an average of 1.5 points, resulting in a
rise in the group average of only 0.2 of a
point. The change in the Picture Arrangement
subtest was also slight, as was that in the Ob-
ject Assembly subtest. However, 43% of the
subjects improved their Block Design score
when given extra time. These changes did not
appreciably disturb the rank-ordering of the
subjects within the sample as shown by the
high correlations between scores under stand-
ard and extended conditions.
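The figures reported for Table 5 can be cross-checked with simple arithmetic: when only a fraction of the group changes, the rise in the group mean should equal that fraction times the average change among those who changed. The following is a minimal sketch in Python under that assumption. The function name is illustrative, and the Block Design line works backward from the reported means rather than from a clearly legible average change.

```python
def expected_mean_rise(proportion_changed, average_change):
    """Rise in the group mean when only some subjects change their score."""
    return proportion_changed * average_change

# (proportion changed, average change among changers, ST mean, ET mean)
reported = {
    "Arithmetic": (0.12, 1.5, 8.9, 9.1),
    "Picture Arrangement": (0.16, 1.6, 5.5, 5.8),
    "Object Assembly": (0.18, 2.8, 8.1, 8.6),
}

for subtest, (prop, change, st_mean, et_mean) in reported.items():
    predicted = expected_mean_rise(prop, change)
    observed = et_mean - st_mean
    print(f"{subtest}: predicted rise {predicted:.2f}, observed rise {observed:.1f}")

# Block Design: 43% changed and the mean rose from 5.7 to 6.6, so the
# average change among changers implied by the means is:
block_design_implied_change = (6.6 - 5.7) / 0.43
print(f"Block Design implied average change: {block_design_implied_change:.1f}")
```

For Arithmetic the product is 0.18, which rounds to the 0.2-point rise stated in the text.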

Table 6 presents the changes in IQ which
resulted from using extended time limits. Only
12% of the subjects increased their Verbal
IQ, and the group average rose only slightly,
from 109.8 to 110.1. However, 60% of the
Performance IQs were raised, bringing the
average for the group up from 112 to almost
114, and 61% of the Full Scale IQs were
raised by allowing extra time, although this
raised the group average by only one point. The rank
correlations between IQ scores under standard
and extended time conditions were extremely
high, indicating that these older subjects do
not appreciably change their ranking in rela-
tion to each other as a result of having ex-
tended time limits.


TABLE 6

MEAN VERBAL, PERFORMANCE, AND FULL SCALE IQs AND CHANGES IN IQs
USING STANDARD AND EXTENDED TIME LIMITS

                                 Verbal IQ       Performance IQ    Full Scale IQ
                                 ST      ET       ST      ET        ST      ET
Mean                            109.8   110.1    112.1   113.9     109.3   110.3
SD                                9.2     9.2     10.1     9.8       9.3     9.3
Percentage of Group Changed        12%              60%               61%
Average Change                     1.5              3.2               1.7
rho                                .98              .95               .99


SUMMARY


A group of 50 relatively healthy, male vet- 
erans of the Spanish American War, who now 
average 80 years of age, was found to be 
above average in intelligence, performing well 
in tests measuring retention and comprehen- 
sion of verbal material but performing poorly 
in tests affected by psychomotor speed and 
abstract thinking. The pattern of decline of 
various abilities tested is consistent within the 
younger and older men of the sample and 
consistent with other studies of older groups. 
The deterioration quotient for the Total Group 
follows closely an extrapolation of the curve 
of deterioration with age empirically derived 
by Wechsler. Extent of intellectual decline was
found to be unrelated to level of intelligence
as measured by IQ, education, or highest oc-
cupational level reached.

The use of standard time limits was found 
to appreciably affect the older person’s score 
on the Block Design subtest, but not the 
Arithmetic, Picture Arrangement, or Object 
Assembly subtests. Standard time limits were 
found to depress the Performance IQ two
points and Full Scale IQ one point, but do
not appreciably affect older persons’
rankings within their group as far as IQ
scores are concerned.


REFERENCES 


CHESROW, E. J., WOSIKA, P. H., & REINITZ, A. H.
A psychometric evaluation of aged white males.
Geriatrics, 1949, 4, 169-177.

DOPPELT, J., & WALLACE, W. Standardization of the
WAIS for older persons. J. abnorm. soc. Psychol.,
1955, 51, 312-330.

FOX, C., & BIRREN, J. E. The differential decline of
subtest scores on the Wechsler-Bellevue Intelligence
Scale in 60-69-year-old individuals. J. genet. Psy-
chol., 1950, 77, 313-317.

HOWELL, R. J. Changes in Wechsler subtest scores
with age. J. consult. Psychol., 1955, 19, 47-50.

MADONICK, M. J., & SOLOMON, M. The Wechsler-
Bellevue scale in individuals past sixty. Geriatrics,
1947, 2, 34-40.

NICHOLS, M. R., & CUMMINS, J. F. The lifelong so-
cial adjustment of a group of normal octogenarians.
Geriatrics, in press.

RABIN, A. I. Psychometric trends in senility and psy-
choses of the senium. J. gen. Psychol., 1945, 32,
149-162.

WARNER, W. L., MEEKER, M., & EELLS, K. Social class
in America. Chicago: Science Research Associates,
1949.

WECHSLER, D. Measurement of adult intelligence.
(3rd ed.) Baltimore: Williams & Wilkins, 1944.

WECHSLER, D. The measurement and appraisal of
adult intelligence. (4th ed.) Baltimore: Williams &
Wilkins, 1953.

(Received March 10, 1960)


Journal of Consulting Psychology 
1961, Vol. 25, No. 2, 142-145 


PREDICTION OF RELAPSE FOR PSYCHIATRIC PATIENTS 


DAVID J. GOUWS 1
Western Psychiatric Institute and Clinic 


In a review of the literature on prognostic 
measures in psychopathology Windle (1952) 
has commented favorably on Feldman’s 
(1951) Ps scale. This scale was developed to 
predict the probability of success of shock 
treatment and consists of 52 items derived 
from the Minnesota Multiphasic Personality 
Inventory. Windle (1952) suggested that 
“further work in prognosis should employ 
this scale, or just as desirable, previously 
gathered data should be reanalyzed with it” 
(p. 466). Only two reports on cross-valida- 
tion studies of the Ps scale were located 
(Pumroy & Kogan, 1958; Roberts, 1959). 
Both failed to confirm Feldman’s own cross- 
validation findings. This paper, which reports 
a reanalysis of previously gathered data, 
presents further validational evidence for, and
possible interpretations of, the Ps scale.


SAMPLE DESCRIPTION 


A search of the files yielded complete MMPI
records for 104 inpatients of the Western Psychi-
atric Institute and Clinic, tested between 1944 and
1951, but mostly in the period 1946-47. Of these,
6 were excluded on the basis of major organic
involvement, and 4 because they had obtained a
? raw score in excess of 100. This left 94 cases
with usable MMPI records.

As the reasons for the original test referrals were
in many cases unknown, these 94 patients could not
be regarded as an unselected sample of the hospital
population at that time. However, since any bias
in this sample presumably would not affect the
range of variation of either the predictors or the
criterion used, the possible unrepresentativeness of
the sample was not regarded as a limitation.
Feldman (1951) listed several criteria which have 
to be met before his Ps scale can be validly em- 
ployed. Of the 94 cases, 60 met all these require- 
ments, while the other 34 failed to meet only the 


1 On leave from the University of Pretoria, South 
Africa. It is a pleasure to acknowledge the help of 
Edith Fleming in obtaining the follow-up data on 
the patients used in this study. 


requirement of elevated scores on one or more of 
the critical scales. These groups will subsequently 
be referred to as the “elevated” and “unelevated”
groups, respectively. 

The following dichotomous “criterion of improve- 
ment” was used in this study: “Improved” cases 
were all those who obtained an “improved” rating 
on leaving hospital and who were not readmitted 
for psychiatric reasons to this or to another hospital 
in the 5 years following discharge. “Unimproved”
cases were those rated “unimproved” on leaving
hospital (usually for transfer to another institution),
as well as those cases with an improved rating upon
leaving hospital, but with a record of a subsequent
relapse severe enough to require psychiatric rehospi-
talization in the 5 years following discharge. (The
shortest period of such rehospitalization was 3
months.) The information on rehospitalization of
former patients had been gathered at 6-month 
intervals from the referring hospitals and physicians, 
and is complete, so far as is known, except for 
some patients who may have been hospitalized in 
another state without mentioning their previous 
hospitalization here. 

Of the 60 patients in the elevated group 23 had 
received either or both insulin and electric shock 
treatment. This treatment started from one day to 
several months after the administration of the 
MMPI. The 11 patients subsequently classified as 
improved in terms of the criterion used received 
an average of 11 electrically induced convulsions 
and 8 insulin comas each. The corresponding
figures for the unimproved group of 12 patients
were 14 and 16. The remaining 37 patients in the
elevated group received general supportive hospital
care, which often included an appreciable amount
of individual psychotherapy. The composition of
the various subgroups in terms of diagnosis, sex,
and age is shown in Table 1.
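The sample accounting in this section can be verified step by step. A minimal sketch in Python using only the counts stated in the text; the variable names are illustrative labels, not terms used by the author.

```python
records_found = 104
excluded_organic = 6      # major organic involvement
excluded_cannot_say = 4   # "?" raw score in excess of 100

usable = records_found - excluded_organic - excluded_cannot_say

elevated, unelevated = 60, 34
shock = {"improved": 11, "unimproved": 12}
shock_n = sum(shock.values())
supportive_n = elevated - shock_n

unelevated_percent = round(100 * unelevated / usable)

print(usable, elevated + unelevated, shock_n, supportive_n, unelevated_percent)
```

The counts are internally consistent: 94 usable records, 23 shock-treated and 37 supportive-care patients in the elevated group, and an unelevated group making up about 36% of the usable sample.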


PROCEDURE AND RESULTS


Attempting to determine what is measured 
by the Ps scale, Feldman (1951) has pointed 
out that an unusually large number of
items which refer to interpersonal relation-
ships are included in his Ps scale, and that
his “recovered” group, although consisting
almost entirely of psychotic patients, ob-
tained scores very similar to that of a group
of normal subjects. Since quality of inter-

TABLE 1

COMPOSITION OF SAMPLE

                                                   Diagnosis and Sex
                                    Age        Affective   Schizo-     Nonpsy-
                                               Disorder    phrenia a   chosis b
Subgroup                     N    Mean   SD     M    F      M    F      M    F
Elevated, shock treatment:
  Improved                   11   35.0   8.8    [diagnosis counts illegible]
  Unimproved                 12   [illegible]
Elevated, supportive care:
  Improved                   20   33.4  11.3    1    1      2    3      4    9
  Unimproved                 17   29.9  10.9    [diagnosis counts partly illegible]
Unelevated, all treatments:
  Improved                   19   36.3   [SD and diagnosis counts illegible]
  Unimproved                 15   35.5   [SD illegible]
                                                3    2      1    4      3    2

a Including two cases of undiagnosed [illegible].
b Psychoneuroses and personality disorders.


personal relationships enters into most con-
cepts of adjustment, it may be asked
whether an “adjustment” questionnaire might
not differentiate equally well between pa-
tients that improve and those that do not.
Furthermore, since 38 out of the 52 items
comprising the Ps scale tended to be an-
swered “True” by Feldman’s unimproved
criterion group, there may also be a question
about the role of response set, specifically of
“acquiescence,” and about the possibility of
using an acquiescence measure as a sup-
pressor variable in predicting improvement.
As scores on a 142-item Adjustment key for
the MMPI (Fulkerson, 1957) as well as on
a 24-item Acquiescence key, also derived
from the MMPI (Fulkerson, 1958) were
available for the patients in this sample,
these questions could be followed up.

The elevated group was divided into
improved and unimproved cases (by the
criterion described) and their Ps, Adjustment, and
Acquiescence scores were compared. The
results in Table 2 show significant
differences between improved and
unimproved patients in the case of all three
variables.
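The t values reported in Table 2 can be approximately reproduced from the group sizes, means, and standard deviations. The sketch below, in Python, assumes a pooled-variance t test for independent groups and assumes the reported SDs use n - 1 in the denominator; the text does not state which formula was used, so small discrepancies are to be expected.

```python
from math import sqrt

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Independent-groups t from summary statistics (pooled variance)."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / standard_error

# Improved (N = 31) versus unimproved (N = 29), values as read from Table 2.
ps_t = pooled_t(31, 21.2, 8.4, 29, 27.2, 8.9)
adjustment_t = pooled_t(31, 55.9, 17.8, 29, 66.8, 16.4)

print(round(ps_t, 2), round(adjustment_t, 2))
```

Both results fall within a few hundredths of the tabled values of 2.68 and 2.47, which also supports reading the SD entries as 8.4, 8.9, 17.8, and 16.4.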
Although the different follow-up criteria
used precluded a strict comparison of the
Ps distributions obtained in the present
investigation with those obtained by Feld-
man (1951), some interesting similarities did
appear. The mean Ps score of the improved 
group in this study is about halfway between 
the means reported for Feldman’s recovered 
and improved test groups, while the present 


TABLE 2

QUESTIONNAIRE SCORES OF THE TOTAL
ELEVATED GROUP

Questionnaire Score
Criterion Status        N     Mean     SD       t        r bis
Ps
  Improved              31    21.2     8.4
                                               2.68**     .41
  Unimproved            29    27.2     8.9
Adjustment
  Improved              31    55.9    17.8
                                               2.47*      .38
  Unimproved            29    66.8    16.4
Acquiescence
  Improved              31    11.5     3.1
                                               2.73*      .42
  Unimproved            29    14.0     3.9

* p < .01.
** p < .005 (one-tailed t tests in the case of Ps and Adjust-
ment).

unimproved group obtained a slightly lower
(more favorable) mean score than his un- 
improved group. The overlap between the 
Ps score distributions of the improved and 
unimproved groups in the present study is 
somewhat greater than that found by Feld- 
man. Inspection reveals that a cutting score 
of 24.5 will maximize the total correct 
predictions of the criterion in the present 
sample. The analogous cutting scores for 
Feldman’s distributions (maximizing total 
correct prediction of his criterion) range be- 
tween 20.5 and 28.5, depending on how the 
dichotomy is obtained. 
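The idea of a cutting score that maximizes total correct predictions can be sketched as a simple search over candidate cuts. The Python example below is illustrative only: the score lists are invented (the individual Ps scores are not given in this paper) and were constructed so that the best cut happens to fall at 24.5. Lower scores are treated as more favorable, as in the text.

```python
def best_cutting_score(improved, unimproved):
    """Return (cut, n_correct) maximizing total correct predictions.

    Scores below the cut predict "improved"; scores above predict "unimproved".
    Candidate cuts lie midway between adjacent observed scores.
    """
    scores = sorted(set(improved) | set(unimproved))
    candidates = [(low + high) / 2 for low, high in zip(scores, scores[1:])]
    best = None
    for cut in candidates:
        correct = sum(s < cut for s in improved) + sum(s > cut for s in unimproved)
        if best is None or correct > best[1]:
            best = (cut, correct)
    return best

# Hypothetical Ps scores, NOT data from the study.
hypothetical_improved = [12, 15, 18, 20, 22, 24, 27]
hypothetical_unimproved = [19, 25, 26, 28, 30, 33, 35]

cut, n_correct = best_cutting_score(hypothetical_improved, hypothetical_unimproved)
print(cut, n_correct)
```

With overlapping distributions, as in the present sample, no cut classifies every case correctly; the search simply finds the point at which the two kinds of error together are fewest.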

The intercorrelations between the Ps, Ad- 
justment and Acquiescence scores of the 
60 patients in the elevated group were: 
Ps-Adjustment, r = .84; Ps-Acquiescence, r
= .65; and Adjustment-Acquiescence, r = .53.

The Ps-Adjustment correlation coefficient 
is of the same order as the corrected split- 
half reliability coefficient of .86 reported 
for the Ps by Feldman (1951). With only 
12 items common to these two scales, which 
were derived under widely different condi- 
tions, such a high correlation is remarkable. 
The high acquiescence loadings (if we assume 
for the moment the validity of the Acqui- 
escence scale used) of both the Ps and the 
Adjustment scales are according to expecta-
tion. Unfortunately, the fact that the
Acquiescence scale discriminates as well as 
do the other two between the improved and 
unimproved patients, rules out the possibility 
of using it as a suppressor variable. 

It remained to be seen whether prediction 
of relapse was equally effective for the group 
who had received general supportive hospital 
care as for the patients who had received 
insulin and/or electric shock treatment. 
Splitting the elevated group into supportive 
care and shock treatment subgroups, it was 
found that the Ps scale discriminated equally 
well (p < .025) between improved and un- 
improved patients in both treatment sub- 
groups. This suggests that what the Ps 
scale measures is not so much ability to 
benefit from shock treatment as the tendency 
to get well irrespective of type of treatment. 
Incidentally, the shock treatment subgroup
obtained a slightly lower (more favorable)
mean Ps score (.10 > p > .05) than the supportive
care subgroup.

A practical limitation of Feldman’s scale 
is that it can only be used with patients 
whose MMPI profiles meet the stated re- 
quirements. In the total sample of 94 pa- 
tients available for this study, 34 (or 36%), 
are thus excluded from consideration. It 
seemed worthwhile to investigate whether the 
relationship observed between Ps scores and 
improvement in the elevated group would 
not be found in the unelevated group as 
well. Comparison of the Ps scores of the
improved and unimproved patients in the
unelevated subgroup yielded no significant
difference, however; neither did the Adjustment
or Acquiescence scores differ significantly,
although all three differences were in
the same direction as for the elevated group.
To estimate the influence of truncation of
scores on the validity of the Ps scale, biserial
r was calculated and corrected for restriction
of range. The r obtained was .22, and it rose
to .31 (p > .05) after correction.
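
The text does not say which correction formula was applied. As a hedged illustration, the sketch below uses the standard correction for direct restriction of range on the predictor, R = rk / sqrt(1 - r² + r²k²), where k is the ratio of the unrestricted to the restricted standard deviation. The function name and the value of k in the usage line are assumptions chosen for illustration, not values taken from the study.

```python
import math


def correct_for_range_restriction(r, sd_ratio):
    """Correct a correlation for direct restriction of range.

    r        -- correlation observed in the restricted group
    sd_ratio -- unrestricted SD divided by restricted SD (assumed known)

    Assumption: the standard direct-restriction formula is used here;
    the source does not name the formula it applied.
    """
    numerator = r * sd_ratio
    denominator = math.sqrt(1 - r ** 2 + (r ** 2) * (sd_ratio ** 2))
    return numerator / denominator


# Hypothetical SD ratio of 1.5 applied to the observed r of .22.
print(round(correct_for_range_restriction(0.22, 1.5), 2))
```

A ratio of 1.0 leaves the coefficient unchanged, and larger ratios raise it, which is the direction of change the text reports.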

Feldman has speculated about the type 
of patient who is ill enough to be in a 
psychiatric hospital, yet responds essentially 
like a normal person to psychological ques- 
tionnaires. Two incidental but interesting 
observations on the unelevated subgroup in 
this study should be reported: Of 8 patients 
diagnosed as “psychopathic personality” or 
“psychopathic state” in the original sample 
of 104 cases, 5 were in the unelevated subgroup.
Secondly, 12 patients out of the 34
in the unelevated subgroup attended college,
and 8 of them graduated. The corresponding
numbers for the elevated subgroup (N = 60)
were 9 and 2. These differences, tested by
χ² with Yates’ correction, were significant
at the .05 and .01 levels for college attendance
and graduation, respectively.
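
The two tests can be retraced from the counts given in the text. The sketch below is an illustration rather than the authors' worksheet: the function name is hypothetical, the "did not attend" and "did not graduate" cells are obtained by subtraction from the subgroup sizes (34 and 60), and it is assumed that graduation was tabulated against all patients in each subgroup.

```python
def yates_chi_square(a, b, c, d):
    """Chi square with Yates' continuity correction for the 2 x 2 table
    [[a, b], [c, d]] (1 degree of freedom)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # The correction subtracts n/2 from |ad - bc|, never going below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * diff ** 2 / (row1 * row2 * col1 * col2)


# Rows: unelevated (N = 34), elevated (N = 60).
# Columns: yes, no (the "no" cells are derived by subtraction).
attendance = yates_chi_square(12, 34 - 12, 9, 60 - 9)
graduation = yates_chi_square(8, 34 - 8, 2, 60 - 2)

# Critical values of chi square for 1 df: 3.841 (.05) and 6.635 (.01).
print(round(attendance, 2), attendance > 3.841)
print(round(graduation, 2), graduation > 6.635)
```

Under these assumptions the attendance value falls between the .05 and .01 critical values and the graduation value exceeds the .01 critical value, which agrees with the levels reported.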


Discussion

Although a significant difference between
improved and unimproved patients was
found, the differentiation in terms of Ps
scores was not marked enough to enable
reliable prediction for the individual patient
to be made except in a small minority of
extreme cases. In evaluating the less impressive
discrimination obtained, as compared
with the data reported by Feldman
(1951), the difference between the improvement
criteria used in the two studies, as well
as the long period that elapsed between 
taking the MMPI and starting shock therapy 
in the case of some patients in this study, 
should be taken into account. Whether this 
latter point made any real difference is 
questionable, as the prediction was shown to 
hold irrespective of treatment received. It 
does suggest that the attribute(s) tapped are
reasonably stable in time. 

Feldman (1951), discussing his own cross- 
validation findings, has suggested that the 
Ps scale measures “propensity to improve” 
irrespective of diagnosis or method of treatment.
The present findings, that the Ps
scale predicts improvement equally well for
shock treatment and for supportive care patients,
and that an Adjustment scale, developed
for a different purpose, can predict
as well as the Ps scale, do seem to confirm
the notion that a general characteristic or
group of characteristics, rather than a specific
characteristic, namely, responsiveness to
shock treatment, is being measured by these
scales. That an acquiescence measure, the
items of which were chosen so as not to
discriminate between well and poorly adjusted
military personnel, predicts improvement
so well, indicates that what Feldman
has tentatively labeled “propensity to improve”
may be a complex entity.


SUMMARY 


In a cross-validation study Feldman’s Ps 
MMPI scale—for the prediction of response 
to shock treatment—was found to discrimi- 
nate significantly between patients who had, 
and did not have, a relapse within 5 years, 
irrespective of treatment. An Adjustment 
scale, developed for military personnel, dif- 
ferentiated equally well between the criterion 
groups. These two scales intercorrelated .84, 
and, respectively, correlated .65 and .53 with 
an Acquiescence scale. The possible use of 
the Acquiescence scale as a suppressor vari- 
able was explored and the implications of 
these data for prognosis of psychiatric 
patients discussed.


THE REPRESENTATION OF PHYSIQUE IN CHILDREN’S
FIGURE DRAWINGS


A. B. SILVERSTEIN 1 
Pacific State Hospital 


The interpretation of human figure draw- 
ings as a projective technique is said to rest 
on the assumption that the drawn figure 
represents the subject’s body image—“the 
picture of his body which he forms in his 
mind” (Schilder, 1950). Much of the re- 
search purporting to test the “body image” 
hypothesis has made use of physically dis- 
abled subjects; investigations of this kind 
were reviewed in a previous paper (Silver- 
stein & Robinson, 1956). A search of the 
literature has revealed but little work based 
on subjects within the normal range of 
physical variation. Berman and Laffal (1953)
reported that the predominant somatotype
of drawn figures was related to that of the
men who drew them;² and Kotkov and
Goodman (1953) found that figures drawn
by obese women tended to cover a greater
area of the page than those drawn by women
of ideal weight.

In a review of empirical evidence on
figure drawings, Swensen (1957) questioned
the treatment and interpretation of the data
of both of these studies, but even if their
seemingly positive results are accepted at
face value, it should be noted that neither
study provided a direct test of the body 
image hypothesis. The same is true of investi- 
gations of the figure drawings of the physi- 
cally handicapped. When previous research is 
considered from an operational viewpoint, it 
is immediately apparent that its focus has 
been the relation between the drawn figure 


1 Formerly at the Psychiatric Institute, University 
of Maryland, where the data for this study were 
collected. 

2 As reanalyzed by the present authors, the data 
presented by Berman and Laffal do not show a 
significant relation between the somatotype of the 
drawn figures and that of the subjects (χ² = 6.53,
df = 4, p > .05).

and the actual structure of the body, not the
body image. Subjects have been selected not
on the basis of differences in body image, but
because they differed with respect to actual
physique.³ While it may be true that normally
there is no discrepancy between the
body image and the body structure, there
seems to be no empirical evidence at present
to support this common assumption. To the
extent that the subject’s “mental picture”
of his physique does not correspond to his
actual physique, the relations and differences
observed in previous research on the body
image hypothesis are clearly in error.

To the writers’ knowledge, the study re- 
ported here is the first to distinguish opera- 
tionally between body image and body 
structure. The method made it possible to
assess the degree of correspondence between
body image and body structure, and to perform
a direct test of the body image hypothesis,
i.e., to relate the drawn figure to
an independent measure of body image. Since
Buck (1948) and others have suggested that
the drawn figure may represent the subject’s
“body ideal” as well as or instead of his
body image, a measure of this construct was
also included in the study.


PROCEDURE 

The subjects were 30 boys and 30 girls, selected
from a total sample of 97 sixth grade public school
children so as to equate for age (mean 11-7, range
11-2 to 12-2). The conventional drawing procedure
was followed during a regular class session. The
children were first asked to draw a person (a whole
person), and then a person of the opposite sex
from the first. No time limits were set for the
task, and no further instructions were given.

3 Silverstein and Klee (1958) conducted an experimental
test of the body image hypothesis in
which body structure was not the basis for selecting
subjects, but in this study, too, no independent
measure of body image was employed.

When the drawings had been completed, a brief 
questionnaire was administered. To obtain measures 
of body image which would be experimentally in- 
dependent of measures of body structure, the chil- 
dren were asked to estimate their height and weight. 
For measures of body ideal, they were asked to 
state how tall they would like to be and how 
much they would like to weigh (at the present 
time) if their height and weight could be changed. 
Finally, the children were weighed and their heights 
measured. 

The height of each of the drawn figures was
measured to the nearest 0.1 inch; and an estimate
of its volume, the height of the figure multiplied
by the square of its width at the waistline (also
measured to the nearest 0.1 inch), was taken to
represent its “weight.”
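
The "weight" index described above reduces to a single product. The sketch below restates it; the function and parameter names are hypothetical, and the measurements in the usage line are invented.

```python
def figure_weight(height_in, waist_width_in):
    """Volume-like index for a drawn figure: height times the square of the
    width at the waistline, both in inches."""
    return height_in * waist_width_in ** 2


# Invented example: a figure 5.0 inches tall and 2.0 inches wide at the waist.
print(figure_weight(5.0, 2.0))
```

Because width enters as a square, the index is far more sensitive to small differences in drawn width than to differences in drawn height.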


RESULTS 


The first step in analyzing the data was to
assess the degree of correspondence between
body image and body structure. For this
purpose, Pearson product-moment correlations
were calculated between estimated and
actual measures, and the significance of the
differences between the means of these measures
was evaluated using the t test for correlated
measures. The findings are shown in
Table 1.⁴


TABLE 1

(N = 60)

Measure              Mean
Height in inches     58.3
Weight in pounds     92.0

* Significant at .01 level.
** Significant at .001 level.


4 The data for boys and girls were initially analyzed
separately; the results for the two groups were
virtually identical, and so they have been combined
in the interest of economy of presentation.


TABLE 2

CORRELATIONS BETWEEN ESTIMATED, ACTUAL, AND
IDEAL MEASURES, AND MEASURES OF SAME-SEX
AND OPPOSITE-SEX FIGURES

(N = 60)

Figure             Estimated    Actual    Ideal
Height
  Same-Sex           -.33*       -.22      -.09
  Opposite-Sex       -.21        -.11       .04
Weight
  Same-Sex           -.27*       -.23      -.19
  Opposite-Sex       -.13        -.12      -.18

* Significant at .05 level.
** Significant at .01 level.


The magnitude of the correlation coefficients
indicates a rather close correspondence
between body image and body structure as
far as height and weight are concerned, the
correspondence being particularly close in the
case of weight. Since the correlations are not
perfect, however, there is at least the possibility
that estimated and actual measures
may be differentially related to measures of
the drawn figures (McCornack, 1956). The
data of Table 1 reveal a significant tendency 
for the children to underestimate their height 
and weight. The most parsimonious interpre- 
tation of this finding appears to be that in 
this period of rapid growth, the children’s 
knowledge of their height and weight is soon 
outdated. 

The correlations between estimated, actual, 
and ideal measures, on the one hand, and 
measures of the same-sex and opposite-sex 
figures, on the other, are given in Table 2. 
The two coefficients which reach the conven- 
tional criteria of statistical significance repre- 
sent correlations between estimated height 
and weight, and the estimated height and 
weight of the same-sex figure. Contrary to 
expectations, however, both of these coeffi- 
cients are negative, a finding which at face 
value suggests an inverse relation between 
body image and the drawn human figure! 


DISCUSSION 


It is not at all clear why the subject’s 
mental picture of his physique should be 
inversely related to the physique of his drawn 
figure. We are reluctant to invoke such 
concepts as compensation, reaction forma- 
tion, or contrast projection, for there seems 
to be no theoretical basis for attempting such 
“dynamic” interpretations of the present re- 
sults; nor do we possess additional informa- 
tion on the subjects of this study which 
might provide independent support for interpretations
of this kind. Under these circumstances,
we prefer to offer no ad hoc
explanation for the findings that estimated 
measures proved to be somewhat better pre- 
dictors than did actual measures, and that 
the relations observed held for same-sex but 
not for opposite-sex figures. Whatever the 
interpretations of the data, it is clear that 
they are not consistent with a body image 
hypothesis which calls for a direct representation
of the body image in human figure
drawings.


SUMMARY AND CONCLUSIONS 


Human figure drawings were obtained 
from 60 sixth grade boys and girls, after 
which they were given a questionnaire de- 
signed to provide measures of body image 
and body ideal. Finally, the children were 
weighed and measured. Estimated height and 
weight were highly correlated with actual 
height and weight, indicating a close corre- 
spondence between body image and body 
structure. When actual, estimated, and ideal 
measures were correlated with corresponding 
measures of the drawn figures, small but 
significant negative correlations were ob- 
tained between estimated measures and 
measures of the figures. None of the other
correlations was significant. Without more
data than are presently available, these findings
are difficult to interpret, but in any
case they are not consistent with the assumption
that the drawn figure directly represents
the subject’s body image.

A. B. Silverstein and H. A. Robinson


INDEPENDENCE TRAINING AND FIRST GRADERS’
ACHIEVEMENT

Psychological research regarding parental
attitudes and child rearing patterns has been
mainly concerned with their general effects on
personality development of the child. Studies
of specific effects of parent-child interaction
on the child’s academic achievement are rarer,
although one might suppose that if variations
in parent attitudes and practices produce
variation in child personality, these
variations would be reflected in, and might
in part account for, variations in school
achievement.

Results of existing studies of influence of
parental variables on school performance
suggest, but not unequivocally, that deviations
in parent-child relationships are related
to deviations in school achievement (Hattwick
& Stowell, 1936; Kurtz & Swenson,
1951; Levy, 1933, 1943). These studies,
however, are based upon situations where
there are extremes in either child or parent
behavior and reveal little about what might
be true of more typical situations.

Hypotheses posited by McClelland (1958)
regarding the role of early training in the
formation of the achievement motive suggest
relationships between certain kinds of
parental behavior and the child’s motivation
toward his school performance. Winterbottom
(1958), in a study within the McClelland
framework, found that earlier demands by
mothers for independence behaviors were related
to higher need achievement in [...]
year-old boys. She did not find
differences between her groups in actual
school achievement as assessed by teacher’s
ratings. Her study, focused as it was on
achievement fantasy, did not provide necessary
controls of the other variables
which might affect actual achievement.

In a recent study of mothers of deaf
children, Gordon (1959) explored the relation
of mothers’ independence training attitudes
to disparities between the children’s
intellectual ability and the extent to which
they had actually accomplished certain developmental
tasks. He found that mothers
of high potential-accomplishment disparity
deaf children favored earlier independence
training more than did mothers of low
disparity children.

d'Heurle, Mellinger, and Haggard (1959) 
in a study of personality, intellectual, and 
achievement patterns of gifted third grade 
children found small positive correlations be- 
tween overprotectiveness of parents and 
arithmetic, reading, and general achievement 
scores. They also found a positive relationship
between parental pressures toward
achievement and achievement test scores.

The present investigation asked whether,
within a group of first grade children who
were neither retarded academically nor disturbed
emotionally, individual differences in
school progress were related to differences in
mothers’ attitudes toward independence training.
The study was performed as part of a
larger assessment program of first grade
children.¹ Medical and psychiatric data, as
well as psychological and achievement test 
results were available for each child. School 
records, the supervisor of instruction, teach- 
ers, and the physician supplied general in- 
formation about and impressions of home 
situations. In addition, for the children of the
total sample who are included in the present
investigation, a questionnaire filled out by
the mother regarding her age expectations for
achievement of independence behaviors was
obtained. The Winterbottom (1958) questionnaire
was adapted for this purpose.

1 These data were collected while the author was
at the University of North Carolina. The author
wishes to express her appreciation to Bernice Wade,
Supervisor of Instruction for the Chapel Hill Public
Schools, to the teachers, and to Kempton Jones
and John Filley whose cooperation made this study
possible.


METHOD 


The children in the first grade class at School A 
numbered 13 girls and 17 boys; the children at 
School B, 19 girls and 16 boys. Only children enter- 
ing first grade for the first time were included in 
the study. Of the 65 children available for study, 
37 were from homes where one or both parents 
were engaged in occupations related to the local 
university. Parents of the remaining children were 
engaged in occupations ranging from professions 
to owners or managers of small businesses and 
skilled workers. 

All children were administered the Revised 
Stanford Binet Scale, Form L, between December 1 
and April 1 of the school year. In May of that 
year, both classes were given the Stanford Achievement
Test, Primary Battery, Form N. The psychiatrist
interviewed each teacher about each child
in her class in order to arrive at a psychiatric
evaluation. The physician compiled available [...]
to obtain a picture of the [...] current physical status of [...]
described all of her children [...] behavioral checklist.
[...] the present investigator [...] the children being studied
[...] independence training questionnaire [...]
a similar type [...] covering letter told mothers [...]
questionnaire even if they [...] item. They were
[...] of the questionnaire [...] which they expected their
children to do what the item described and also
to check the items they felt were especially important
goals of their child rearing.²

The covering letter invited mothers to phone the
investigator if they needed further clarification of
the questionnaire. Of the five mothers who called,
in only one instance did there seem to be genuine
confusion about the questionnaire itself. The others
seemed merely curious or in need of reassurance.
Of 65 questionnaires sent, 52 were returned. The
results reported here are based on data from these
52 complete cases.


2 Verbatim copies of the instructions and covering
letter are available from the author on request.


TABLE 1

MEANS AND STANDARD DEVIATIONS OF INTELLIGENCE,
ACHIEVEMENT, AND INDEPENDENCE TRAINING
MEASURES FOR ALL SUBJECTS

[Table body not recoverable. Columns: Measure, Group, N, Mean, SD.
Rows: Intelligence Quotient; Reading Achievement (Stanford
Achievement Test, Primary Battery, Form N); Arithmetic Achievement
(Stanford Achievement Test, Primary Battery, Form N); Revised
Independence Training Questionnaire; each reported for Girls, Boys,
and Both.]

TREATMENT OF DATA AND RESULTS


Table 1 presents a summary of means and 
standard deviations of the measures available 
for each of the 52 subjects. No child in this 
group had any marked physical or sensory 
handicap, nor were any of these children 
from homes broken by death or divorce. 
Observation and psychiatric evaluation sug- 
gested that 15 children in the total group 
of 65 showed some mild degree of personality 
disturbance. Two of these children are in- 
cluded in the 52 of this study—both appear 
in the early independence training subgroup. 

Examination of Table 1 reveals that the 
group studied was superior in both intellectual
ability and school achievement. Tests
of mean differences in intelligence scores,
reading and arithmetic achievement scores,
and independence training scores indicated
no significant differences between children in
the two different schools; hence the two
school groups are combined in Table 1 and
in all further analyses of the data.

Since some items had been added to
Winterbottom’s independence training
questionnaire, an item analysis of the
questionnaire was performed. Mothers were
designated as favorable to early or to late
training on the basis of their average age
estimate for the entire questionnaire, i.e.,
the sum of all ages given divided by the
number of items answered. (Not all mothers
answered all items; all answered at least 23
of the 28.) Age estimates for each item given
by mothers favorable to early training were
then classified as above or below the median
age estimate given by all mothers for that
item. Age estimates given by mothers favorable
to late training were similarly classified.
Significances of differences in responses to
items by the two groups of mothers were
tested by means of 2 X 2 chi squares. The
complete list of the independence training
questionnaire items is given in Table 2, along
with ranges and median age estimates for the
total group of mothers. Table 2 also contains
the values of chi square and the significance
levels of those items answered
differently by mothers favorable to early
independence training and by those favorable
to late.


TABLE 2

RANGES AND MEDIANS OF AGE ESTIMATES GIVEN BY ALL MOTHERS TO
ITEMS OF THE INDEPENDENCE TRAINING QUESTIONNAIRE AND
SIGNIFICANCES OF DIFFERENCES IN RESPONSE BETWEEN MOTHERS
FAVORABLE TO EARLY AND THOSE FAVORABLE TO LATE TRAINING

                                              Range of    Median
                                              Age         Age         Value(a)
Item                                          Estimates   Estimate    of χ²

 1. To stand up for his own rights            2-6         3           .36
    with other children
 2. To know his way around his part           2-10        5           2.84*
    of the community so he can play
    where he wants without getting lost
 3. To go outside to play when he             2-6         3           .71
    wants to be noisy or boisterous
 4. To be willing to try new things           1-8         4           21.33****
    on his own without depending
    on his mother for help
 5. To be active and energetic in             1-8         4           4.06
    climbing, jumping, and sports
 6. To show pride in his own ability          1-10        4           3.98
    to do things well
 7. To take part in his parents’              3-17        7           2.95*
    interests and conversations
 8. To try hard things for himself            3-11        6           --
    without asking for help
 9. To be able to eat alone without           1-9         5           --
    help in cutting and handling food
10. To be able to lead other children         3-10        5           16.48
    and to be able to assert himself
    in children’s groups
11. To make his own friends among             --          --          --
    children of his age
12. To hang up his own clothes and            3-8         6           --
    look after his own possessions
13. To do well in school on his own           5-12        6           .76
14. To be able to undress and to              3-8         6           --
    go to bed by himself
15. To have interests and hobbies             2-10        5           .62
    of his own (be able to entertain
    himself)
16. To earn his own spending money            6-21        10          1.80
17. To do some regular tasks around           3-8         6           2.84*
    the house
18. To be able to stay at home during         5-16        10          8.39***
    the day alone
19. To make for himself decisions             3-15        9           13.20****
    like choosing his clothes or how
    to spend money for toys, hobbies,
    recreations, etc.
20. To do well in competition with            1-10
    other children (to try hard to
    come out on top in games and sports)
21. To be satisfied to stay with              1-6
    someone he knows well when
    parents must be away for a few days
22. To decide upon and to purchase            --
    small gifts with his own money
    for family members and close friends
23. To hold short conversations with          4-10
    grown-up friends who come to
    visit the family
24. To visit and to stay overnight            --
    with a playmate
25. To straighten out most of his             --
    difficulties with other children
    without adult intervention
26. To be interested in obtaining             6-12
    good grades in school
27. To take part in group activities          6-11
    such as clubs, scouts, etc.
28. To read a simple story or comics          6-8
    by himself

Median and χ² entries surviving for Items 20-28, in source order
(alignment with items uncertain): --, 15.75****; 3, .09; 7, 10.48***;
5, 4.63**; 6, 3.37*; --, 3.96**; 8, 5.48**.

-- Entry not recoverable.
a All values of chi square were computed with Yates’ correction for continuity.
**** p < .001.


Rescoring each mother’s questionnaire
using only the responses given to the
19 items showing significant differentiation
between the two groups of mothers, a revised
independence training score (RIT) was obtained.
The RIT scores summarized in
Table 1 are average age of demand scores,
i.e., the sum of estimates given divided by
the number out of the 19 items responded to.
In order to investigate the effects of independence
training on achievement it was
necessary to hold constant differences in
intellectual ability. To do this, the distributions
of intelligence scores and of reading
and arithmetic achievement scores were converted
into ranks and thence into standard
rank scores. Using these converted scores, a
disparity score was computed for each child
in reading and in arithmetic. That is, the
difference between his ranking in the total
group on the basis of each of his achievement
scores and his ranking in the group on the
intelligence measure was obtained. If a
child’s intelligence rank and achievement
rank were equal his score would be zero.
If his intelligence rank exceeded his achievement
rank his score would be negative, or
if his achievement rank exceeded his intelligence
rank his score would be positive. In
order to remove the negative scores a constant
of 30 was added to the scores summarized
in Table 3. In Table 3, therefore,
scores below 30 indicate achievement rank
below intelligence rank, while scores above 30
indicate the converse.
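
The disparity score described above can be sketched as follows. The conversion to "standard rank scores" is not specified in the text, so this illustration substitutes simple ranks (1 = lowest score, ties given the mean of their ranks); that substitution, the function names, and the data in the usage lines are all assumptions.

```python
def ranks(values):
    """Rank values from 1 (lowest) upward; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def disparity_scores(intelligence, achievement, constant=30):
    """Achievement rank minus intelligence rank, plus a constant that
    removes negative values (30 in the text)."""
    iq_ranks = ranks(intelligence)
    ach_ranks = ranks(achievement)
    return [a - q + constant for q, a in zip(iq_ranks, ach_ranks)]


# Invented scores for three children.
print(disparity_scores([100, 120, 140], [3.0, 2.0, 2.5]))
```

Scores above the constant mark a child whose achievement rank exceeds his intelligence rank, and scores below it mark the converse, matching the reading of Table 3 given in the text.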

Subjects were then divided into early and
late independence training groups on the
basis of mothers’ scores on the RIT questionnaire,
and mean differences in disparity
scores were tested using Fisher’s t. The
analyses were done first for girls and boys
separately and then for combined groups.
The results of these tests appear in Table 3.

Differences between early and late independence
training groups are statistically
significant in every case and in a direction
suggesting that children whose mothers favor
earlier demands for independence make poorer
school progress relative to their intelligence
level than children whose mothers favor later
independence demands. The differences appear,
at least superficially, to be more marked
in girls than in boys and more marked in
reading than arithmetic.


TABLE 3

DISPARITY BETWEEN GROUP RANKING IN INTELLIGENCE
SCORES AND RANKING IN ACHIEVEMENT IN EARLY
AND LATE INDEPENDENCE TRAINING GROUPS

[Table body not recoverable. Columns: Division; Early Group Mean and
SD; Late Group Mean and SD; Value of t. Rows: Reading and Arithmetic,
each reported for Girls (N = 24), Boys, and Both.]

* p < .02.
** p < .01.
*** p < .001.


Tests of the frequency of items checked by 
mothers on the questionnaire as specially im- 
portant to them failed to show any significant 
relation to other measures. Similarly an hy- 
pothesis that early and late training groups 
might contain differential numbers of chil- 
dren from university-related homes was not 
substantiated. 


DISCUSSION 


While the findings of this study confirm 
those of Gordon (1959) with mothers of 
deaf children, they are contrary to those 
which might be expected from Winterbot- 
tom’s study (1958) and from McClelland’s 
(1958) hypotheses about age of independ- 
ence training and n Achievement. However, 
this study concerns actual achievement, while 
Winterbottom and McClelland are concerned 
with motivation to achieve. The multitude 
of variables which operate and interact to 
produce differences in actual achievement are 
exceedingly complex. However, the fact that 
mothers’ attitudes toward earliness of inde- 
pendence training and actual achievement are 
inversely related in these findings where 
healthy, psychologically sound youngsters³
from a fairly homogeneous social group were 
studied suggests that relations among ma- 
ternal attitudes toward independence train- 
ing, n Achievement, and actual achievement 
need to be more extensively investigated. 

If one may generalize from this group of
“normal” children to children referred to 
clinics for academic difficulties, and if one 
assumes that very early demands for inde- 
pendence behaviors by the mother may be 
experienced by the child as excessive pres- 
sure, the findings here confirm the common 
clinical hypothesis that school failure is a 
means by which the child can express 
resistance to the parent (Vorhaus, 1946). 

3 Reanalysis of the data, excluding the two cases
of “maladjusted” children from the early training
group, failed to alter the results significantly. One,
a girl with an IQ of 146, had a reading disparity
score of 19 and an arithmetic disparity score of 18;
the other, a boy with an IQ of 125, had a reading
disparity score of 36 and an arithmetic disparity
score of 28.

Examination of the independence training
questionnaire from the point of view of what
other than attitudes toward independence
training it might sample suggests that many
items could be interpreted as related to the 
mother’s need to maintain interpersonal dis- 
tance in her relationship with her child. As 
noted earlier, Kurtz and Swenson (1951) 
found evidence suggesting that parents of 
underachievers might be more distant in their 
relationships with their children than parents 
of overachievers. d’Heurle, Mellinger, and 
Haggard (1959) found both parental over- 
protectiveness and pressures for achievement 
to be positively associated with high achieve- 
ment. Taking all this evidence together the 
following hypothesis is suggested: mothers’ 
attitudes toward early independence training 
will differentially influence the child and his 
subsequent school achievement depending 
upon whether she maintains a close or distant 
interpersonal relationship with him. The
same hypothesis can also be stated thus:
maternal attitudes favoring early independence
will have a different impact upon the child
depending upon the mother’s motivation for desiring
that independence. A possible theoretical
distinction between instrumental act
independence and emotional independence is 
implied. 
The findings of this investigation are
limited by the selective nature of the situation
studied: the children’s superior intellectual
ability and small numbers, their
attendance in a good school system with
excellent teachers, and their parents’ relatively
high socioeconomic and cultural status.
A further source of bias is evident in the
differential return of the mail questionnaires.
While it is fascinating that of 13 mothers
who did not return their questionnaires, all
13 were mothers of children judged by the
investigators to be somewhat maladjusted,
it is difficult to see how this bias might have
influenced the direction of the findings.
Rather, it seems to define them all the more
clearly.


SUMMARY


The relationship between mothers’ atti- 
tudes toward independence training and 52 
first grade children's school achievement was 
studied. It was found that children whose 
mothers favored earlier independence train- 
ing made less adequate school progress in 
both reading and arithmetic relative to their 
intellectual ability than children whose 
mothers favored later independence training. 
Implications and limitations of the findings 
are discussed.


With the onset of blindness, an individual
becomes aware that he can no longer execute
certain sensory-motor responses. A reduction
in the repertoire of responses often leads to a
[...] of trauma. Loss of vision hinders ordinary
[...] and creates a special problem
[...] to physical objects. [...]
this phase of blindness, even eating
a simple meal may become a complex task.
Thus, social adaptation and acceptance of a
physical disability are largely dependent
upon the acquisition of new skills and habits.
Among many physically disabled persons, [...]
blindness becomes an intolerable state
of affairs and results in a temporary
or permanent regression of ego functions.
The wish for the restoration of vision may
[...] an individual to rely on an earlier
mode of ego development where the boundaries
between reality and irreality lack adequate
distinction. Consistent with Piaget’s
(1953) description of the “magico-phenomenalistic”
state of ego development, a
blind person may place unusual confidence in
miraculous cures and ascribe magical characteristics
to persons and objects that he
believes may somehow do away with his
blindness. Hence, a regression of ego functions
may hinder the learning of new and
appropriate social habits.

The present investigation was an attempt
to shed some light on the nature and extent
of these magical beliefs and to determine the
psychological characteristics of personality
which are relevant to the process of social
adaptation and acceptance of blindness.

Previous authors have dealt with similar 
problems. In a theoretical paper, Barker, 
Wright, Meyerson, and Gonick (1953) de- 
scribed the psychological reactions of blind 
individuals in terms of Lewin’s (1951) model 
of behavior. These authors stated that blind- 
ness creates a new psychological region in 
which the locations of goals are unknown and 
behavioral routes are unfamiliar. A new 
psychological region, therefore, becomes a 
source of frustration, conflict, and anxiety. 

Blindness also creates a state of helpless- 
ness and dependency. According to Adorno, 
Frenkel-Brunswick, Levinson, and Sanford 
(1950), the “anti-democratic personality” 
has dependency needs that involve feelings 
of “doubt, uncertainty, and momentary lack 
of self-confidence.” They illustrated that 
dependency may be associated with “worry 
about the future, realization of impending 
danger, and feeling absolutely lost.” 

Applying psychoanalytic principles to her 
study, Burlingham (1941) reported the case 
histories of two blind children. After close 
examination, she concluded that the loss of 
sight seriously interfered with the function 
of the ego to test reality. These blind chil- 
dren frequently engaged in fantasies in which 
wishes were fulfilled and certain unpleasant 
aspects of reality were denied. 

Similarly, Deutsch (1940), who studied 28
persons born blind, noticed a readiness to
give up reality and to escape into fantasy.
A large proportion of these subjects believed
a cure would come from a supernatural
power. Thus, Deutsch concluded:


This expectation that the cure would come from a
supernatural power belongs partly to the realm of
fantasy in which there is room for the fulfillment
of all wishes (p. 124).


For the present study, a series of hypotheses
dealing with the relation between
the psychological characteristics of the blind
and social adaptation were formulated.
Among them were the following:


1. Social adaptation to a new psycho- 
logical region is largely enhanced by ego 
strength, low manifest anxiety, and a positive 
attitude towards blindness. 

2. Failure at social adaptation results in 
magical beliefs as to the power of medicine 
and religion in the treatment of illness and 
disability. 

3. An antidemocratic personality is associ- 
ated with failure in social adaptation. 

4. There are no personality differences
between socially well-adjusted blind subjects
and physically normal individuals.


METHOD

Subjects

[...] were [...] control group for the 25 blind subjects
[...] adjustment to [...] were [...] intelligence,
socioeconomic background, and religious affiliation.


Scales

Based on the research [...] (1954), Fitting (1953),
[...] Social Adjustment Scale was [...]
employment, travel, communication, [...] dressing
problems, business [...] hygiene, [...]
in a civic organization, maintains proper etiquette
during meals, and buys his own clothes.

Each blind subject was rated on the Social Ad- 
justment Scale by a social worker most familiar 
with him. A second rating was obtained by the 
investigator who interviewed each subject and 
members of the family. The coefficient of
correlation between the two sets of scores was .92.
Applying the Spearman-Brown Prophecy Formula,
the correlation was raised to .95.

Personal adaptability, manifest anxiety, and atti- 
tudes towards blindness were measured by the 
Barron Ego Strength scale (1953), Taylor Manifest 
Anxiety Scale (1953), and the Fitting Attitudes 
towards Blindness Scale (1953), respectively. 

Two separate attitude scales were constructed to 
get at the confidence placed in medicine or religion 
to restore sight. Each scale consisted of statements 
which reflected attitudes from extremely negative 
to extremely positive. On the Religious Scale, for 
instance, the statement, “The healing of lepers as 
mentioned in the Bible is a fairy tale,” was rated 
negative towards religion; whereas, the statement, 
“A perfect spiritual faith would absolutely lift us 
from all physical disease,” was considered extremely 
positive. For each of the two scales, 22 statements 
were selected by Thurstone’s (1951) method from 
original lists of 130; each was rated on an 11-point 
scale. 

Both instruments were administered to 80 under- 
graduate college students. The split-half reliability 
coefficient of correlation was .80 for the Medical
Scale and .93 for the Religious Scale. The Spearman-Brown
Prophecy Formula raised the coefficients to
.92 and .96.
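
The stepped-up coefficients reported here rest on the Spearman-Brown formula, which predicts the reliability of a measure lengthened k-fold from the observed correlation r: r_k = kr / (1 + (k - 1)r). The following is a minimal Python sketch of that formula, assuming the usual doubling case (k = 2). The function name is illustrative only, and values computed this way need not reproduce the printed figures exactly, since some digits in the source are uncertain.

```python
def spearman_brown(r, k=2.0):
    """Predicted reliability when a measure is lengthened k-fold.

    r is the observed correlation (for example, between two halves
    of a scale or between two raters); k = 2 is the usual
    split-half correction.
    """
    if not -1.0 < r <= 1.0:
        raise ValueError("r must lie in the interval (-1, 1]")
    return k * r / (1.0 + (k - 1.0) * r)


if __name__ == "__main__":
    # Illustrative inputs only; any observed correlation may be used.
    for r in (0.50, 0.80, 0.93):
        print(f"r = {r:.2f} -> stepped-up = {spearman_brown(r):.3f}")
```

The correction is monotonic in r, so a higher observed correlation always yields a higher stepped-up estimate, and a perfect correlation is left unchanged.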

The California F Scale, Form 78, devised as a 
test of dependency and reliance on authority was 
used to measure “antidemocratic personality.” 

An estimate of intelligence was obtained from the
Vocabulary subtest of the Wechsler-Bellevue scale,
Form 1 (1944).

At the end of the psychological test battery, an 
interview supplemented the data obtained from the 
Medical and Religious Scales. During the interviews 
an additional attempt was made to determine the
attitudes of blind subjects towards “miraculous”
cures by allowing the subjects to verbalize their
problems.


Procedure 


The social worker most familiar with the subject
was asked to rate him on the scale of social adjustment
to blindness. The ratings of each social worker
were corroborated by similar ratings obtained from
a member of the family and a close acquaintance.
Because the scale dealt with behavioral items, there
was little disagreement. Scores could range from
[...], higher scores indicating superior
[...]. The [...] of test presentation was always
[...]. Test items were read aloud to each blind
subject and the answers tape recorded. The interview
was recorded verbatim and then rated by two
separate judges along a five-point scale of
confidence the subject ascribed to medicine and
religion for bringing about a miraculous cure. A
rating of 1 indicated a strong confidence and a
rating of 5, no confidence. The same procedure was
followed with physically normal subjects except that
they were not rated for social adaptability.


RESULTS 


On the Social Adjustment Scale, the scores
of the 52 blind subjects ranged from a low
of [...] to a high of 29. The mean score for
the entire sample was 16.3. The distribution
of scores approximated a normal probability
curve.

As Table 1 indicates, all psychological
measures correlated significantly (.01 level)
with scores on the Social Adjustment Scale.
These correlations support the hypotheses
that higher ego strength, lower manifest
anxiety, and a more positive attitude towards
blindness are important psychological variables
which are strongly related to a blind
person’s ability to make an adequate
social adjustment. Reliance on the power of
medicine and religion to restore vision was
negatively correlated with the blind person’s
level of social adaptation. Antidemocratic
personality was also negatively related to
social adjustment to blindness. When the
possible effects of intelligence and duration
of blindness were partialed out, all correlations
remained significant at the .01 level
with the exception of the correlation
with the Medical Scale, for which the
confidence level dropped to .03.


TABLE 1

CORRELATIONS BETWEEN SCORES ON SOCIAL ADJUSTMENT
SCALE AND PSYCHOMETRIC TESTS

Score                          r*

Ego Strength Scale             [illegible]
Manifest Anxiety Scale         -.64
Attitude towards Blindness      .53
Attitude towards Medicine      -.61
Attitude towards Religion      -.60
California F Scale             -.62

* All significant at .01 level.


A comparison of scores on each of the
psychological tests indicated that no significant
differences existed between the 25
physically normal individuals and the 25
blind subjects who scored above average on
the Social Adjustment Scale.

As can be seen in Table 2, blind subjects
who scored below the mean on the Social
Adjustment Scale were frequently rated from
their interviews as individuals who rejected
their blindness and believed in miraculous
cures. Better adjusted blind subjects tended
to accept their blindness and rejected the
idea of a miraculous cure. Physically normal
subjects tended to be neutral.

The distribution of scores for both blind
groups as indicated in Table 2 shows the
TABLE 2

RATINGS OF THE ATTITUDES TOWARDS MEDICINE
AND RELIGION BASED ON INTERVIEW PROTOCOLS

Columns: Group; N; ratings 1 to 5 for Medicine;
ratings 1 to 5 for Religion.

[Cell values are only partly legible in the source and
their column positions cannot be recovered; the legible
entries are listed for each group in source order.]

Poorly Adjusted Blind (N = 27): 8%, 63%, 10%, 3%, 6%,
18%, 0%, 18%, 18%, 32%, 32%
Well Adjusted Blind (N = 25): 1%, 0%, 11%, 3%, 1%,
4%, 86%
Visually Normal Subjects (N = 25): 20%, 20%, 36%, 12%,
2%, 4%, 8%, 12%

Note. Scale values run from 1 (Positive Towards) to 5
(Negative Towards). Values shown are percentages of
ratings by two judges.


divergence of observed from expected results
was significant at the .01 level. For the
physically normal subjects, the divergence of
observed from expected results was not
significant at the .05 level.

To test for similarity among psychological
tests, a series of intercorrelations was obtained.
These intercorrelations ranged from
.04 to .59, with the mean correlation at .30.
The intercorrelations which fell above the
mean were: F Scale and Medical Scale, [illegible];
F Scale and Religious Scale, .57; and Medical
Scale with Religious Scale, .38.


DISCUSSION


An examination of the results clearly
shows that social adjustment to blindness is
closely related to ego strength, manifest
anxiety, and attitudes towards blindness. To
a large extent, these psychological variables
appear to determine the kind of personal
and social adjustment an individual makes
to his physical disability. On the other hand,
the psychological characteristics of an individual
may be a reflection of the amount of
social adjustment he has made towards his
[...]

[...] other human beings. The person may attempt
to withdraw from the “struggle” and avoid 
competition against unfavorable odds. 

The individual who perceives the world in 
a less threatening sense generally has a more 
favorable attitude towards blindness. The 
physical disability does not place him in a 
dangerous position; hence, there is no need 
to withdraw from social situations. He ac- 
quires new skills and habits which permit 
him to carry out many of his personal and 
social functions. 

As mentioned at the beginning of the 
study, a blind individual may seek an escape 
route by which he avoids the unpleasant 
aspects of the new psychological region. A 
common escape route, for many of these 
individuals, takes the form of magical beliefs 
towards the recovery of vision. The results 
demonstrate that subjects who resisted the 
acceptance of blindness placed unusual con- 
fidence in the ability of medicine and religion 
to perform miraculous cures. Subjects who 
accepted their physical disability expressed 
more realistic attitudes towards medicine and 
religion. 

Finally, attitudes towards authority have 
an important bearing on the adjustment 
process to blindness. Making an inference
from the work of Adorno et al. (1950), it is
probable that the blind individual who has
an antidemocratic personality finds the new
psychological environment full of dangerous
elements. The overwhelming threat may
cause him to lose self-confidence and depend
heavily on the support of authority as a
means of survival. The less authoritarian-minded
person probably does not perceive
the environment as quite so threatening. Since
extremely dangerous elements are not present,
he is able to maintain self-confidence
and achieve levels of independent behavior.

In the area of magical thinking, intelligence
had some influence on the formation
of attitudes towards miraculous cures in
medicine. However, intelligence had almost
no effect on attitudes towards miraculous
cures in religion.

Many intelligent blind individuals eventually
adopt realistic attitudes towards medicine.
However, if acceptance of blindness
does not accompany a realistic attitude
towards medicine, these individuals may discover
new areas into which they project their
magical beliefs. Having failed to regain
vision, they may shift their magical beliefs
from ophthalmology to religion or to some
other frame of reference through which they
hope to experience a miraculous cure.

A comparison of the F Scale with both
the Medical and Religious Scales revealed
comparatively high correlations. These correlations
suggest the presence of a common
psychological factor. The three scales undoubtedly
measure personal reactions to authority.
Medicine and religion could be considered
as specific domains of authority,
whereas items on the F Scale pertain to authority
in a general sense. Thus, attitudes
towards authority in the broad sense may be
identical with attitudes towards authority in
specific areas.


SUMMARY 


Fifty-two blind subjects were rated on the
Social Adjustment Scale and then tested for
ego strength, manifest anxiety, attitudes
towards blindness, attitudes towards the
efficacy of medicine and religion to restore
health, and degrees of antidemocratic personality.
Each blind subject was interviewed
so that additional data could be obtained
about attitudes towards a miraculous cure of
blindness. Twenty-five physically normal
subjects were matched with 25 blind subjects
who scored highest on the Social Adjustment
Scale. The control subjects followed the same
procedure as that of the blind subjects but
were not rated for social adaptability.

The results indicated that social adaptation
to blindness was related to high ego
strength, low manifest anxiety, and a positive
attitude towards blindness. Blind subjects
who were socially maladjusted placed unusual
confidence in the efficacy of medicine
and religion to restore health and strongly
believed they would regain vision through a
miraculous cure. Subjects who were poorly
adjusted to the environment were characterized
as more antidemocratic in their personalities
than blind subjects who had made a
good social adjustment to their handicap.

No significant differences were obtained
between socially adapted blind and physically
normal subjects.


In the comparatively brief history of systematic
research in psychotherapy, verbal
measures have been given a great deal of 
attention. Much of this work has taken the 
form of developing systems of categories 
for classifying manifest content. A survey 
of this literature by Auld and Murray (1955) 
lists 99 content analysis studies, all of them 
dealing in one way or another with what 
people say. 

Although it is generally recognized that
how people talk is also a significant aspect
of their speaking behavior, comparatively
little attention has been devoted to the
formal, structural, and expressive aspects of
speech. The major variables which have been
systematically studied [...] (Goldman-Eisler,
19[...]; [...] Smith, 1957 [...])

[...] exclusion; hence the speech disturbance data
in this paper will refer only to categories
b-h, which Mahl terms the “non-ah” disturbances.

These speech disturbances (hereinafter 
referred to as SDs) are scored by listening 
to recordings of speech and marking each 
disturbance on a typewritten transcript at 
the precise place in the text where it occurs. 
The speech disturbance ratio (SDR) for 
any selected passage can then be calculated 
by dividing the number of speech disturb- 
ances by the number of words spoken. 
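
As an illustration of the ratio just defined, the following Python sketch computes an SDR from a count of scored disturbances and a word count. The function name and the sample counts are hypothetical and are not taken from Mahl's data or from the present study.

```python
def speech_disturbance_ratio(n_disturbances, n_words):
    """Return the SDR for one passage: disturbances divided by words."""
    if n_words <= 0:
        raise ValueError("a passage must contain at least one word")
    if n_disturbances < 0:
        raise ValueError("the disturbance count cannot be negative")
    return n_disturbances / n_words


if __name__ == "__main__":
    # Hypothetical counts for two passages of one interview.
    passages = {"passage A": (12, 300), "passage B": (4, 250)}
    for name, (sds, words) in passages.items():
        print(f"{name}: SDR = {speech_disturbance_ratio(sds, words):.3f}")
```

Because the ratio is normalized by the number of words, passages of unequal length can be compared directly.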

Mahl hypothesized that SDR varies di- 
rectly with fluctuations in the speaker’s 
anxiety level. As a test of this hypothesis he
compared SDRs with his clinical judgments
of anxiety in six therapeutic interviews with
a patient he had treated 2½ years previously.
Contamination was avoided by having all 
SDs edited out of the typewritten tran- 
scripts from which the clinical judgments 
were made. His central finding was that the 
mean SDR was significantly higher for those 
phases of the interviews he judged “high 
anxious” than for those phases he judged 
“low anxious.” 

At the time Mahl’s study was published 
the present authors were trying to define and 
measure some aspects of patients’ speech 
which might reflect the immediate effects of 
specific therapeutic interventions. The SDR
appeared to be a potentially useful research 
instrument for this purpose, provided that 
certain issues could be clarified. These issues 
are embodied in the following hypotheses 
which were the basis of this replication and 
extension of the Mahl study: 


H1. High anxiety interview phases, as
judged by the therapist, have a greater mean
SDR than low anxiety interview phases.


This, of course, was Mahl’s fundamental
finding. Our aim was straightforward replication:
could this effect be successfully demonstrated
again using different patients and
therapist-judges?


H2. High anxiety interview phases, as
judged by persons other than the therapist,
have a greater mean SDR than low anxiety
interview phases.


This hypothesis formalizes the question as
to whether the judgment of anxiety requires
the intimate, detailed, firsthand knowledge
of the patient possessed only by the therapist.
This question cannot be answered from
Mahl’s results since he was both therapist
and sole judge.


H3. Independent judgments by different
judges demonstrate substantial agreement as
to the identification of phases of high and
low anxiety.


This is a test of the interjudge reliability
of the criterion judgment of anxiety. It also
tests one of the links in the logical chain
which relates the SDR to anxiety, since the
statement that A is a measure of B requires
that values for both A and B can be determined
with acceptable reliability and that
they covary systematically. Mahl’s study
demonstrates only that the SDR is a reliable
measure and that it covaries with his assessment
of anxiety. The present study has
sought evidence concerning the missing term,
that is, the reliability with which fluctuations
of anxiety within an interview can be
judged.


METHOD 


Subjects. The subjects for this [...]
outpatients in a psychotherapy r[...].
Mrs. Alpha had been seen for ap[...]
[...] year by one of the authors. Mrs. Beta had been
seen for approximately 50 hours [...]
author. All of their interviews had been tape
[...] interviews were [...] verbatim for [...]
were trained [...] scoring for this
study. The reliability of their scoring [...]

[...] judgments of anxiety were made [...]
edited transcripts.

Anxiety Judgments. These judgments were made
according to the procedure described by Mahl in 
his report and in personal discussions with the 
authors. The preparation for the judging task began 
with an extended clinical review of the case. The 
judges then listened to and discussed a selection of 
the recorded interviews, including those interviews 
immediately preceding and following the test inter- 
views, but excluding, of course, the test interviews 
themselves. 

After this preparation, which required about 15 
hours, the judges independently made the clinical 
judgments of anxiety on the test interviews. The 
task was to divide each interview into a series of 
phases, representing periods of relatively high or 
relatively low anxiety for the patient. In describing
this judging task Mahl (1956) says:


During therapeutic sessions and while studying
recordings it often appears that interviews are di- 
visible into “natural” segments or phases, each 
of which could be assigned to a single theme of 
content or interaction, and that the patient be- 
comes anxious and conflictful in some, but 
becomes less anxious in others (p. 6). 


The judging procedure, as described by Mahl, 
requires what can best be termed “immersion” in 
the interview material. The judges read and reread 
the typescripts at odd moments over a period of 
several weeks, making notes, revising, and trying 
out tentative approaches until the emerging phases 
seemed to stabilize. Finally these phases were marked 
off in the transcript and labeled “high” or “low” 
anxiety. The clinical preparation and judging were
carried through to completion for Mrs. Alpha
before beginning with Mrs. Beta.

The judges were the authors, who judged both 
cases, and three additional volunteers, two of whom 
judged Mrs. Beta’s interviews and one of whom 
judged Mrs. Alpha’s. All five judges were trained 
psychotherapists with 5 to 10 years’ experience. 


RESULTS 


The data to be presented in this section 
will be organized in terms of the three hy- 
potheses listed in the introduction. 

H1. Only the phase judgments of the
authors on their respective patients were
relevant to this hypothesis. The data analysis
was directly adopted from Mahl. For each
phase of each interview an SDR was computed,
and mean SDRs calculated for the
high anxiety and low anxiety phases for each
patient.2

2 For individual interviews the numbers of high
and low phases were not always equal. In order to
avoid the bias of the “hour effect” described by
Mahl (1956, p. 7) the excess high or low
phases were randomly discarded from those interviews
in which the numbers were not equal.

The results of this test are equivocal. In
the case of Mrs. Beta the high anxiety phases
established by her therapist had a significantly
higher mean SDR than did the low
anxiety phases (t = 2.29; p < .05). In the
case of Mrs. Alpha there was no significant
difference between these means.
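
The balancing step described in Footnote 2 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' actual computation: each interview is represented as a pair of lists of phase SDRs (high anxiety, low anxiety), the function names are hypothetical, and the sample values are invented. Excess phases are discarded at random within each interview before the high and low phases are pooled across interviews.

```python
import random
from statistics import mean


def balance_phases(high, low, rng):
    """Randomly discard excess phases so that one interview
    contributes equal numbers of high and low anxiety phases."""
    n = min(len(high), len(low))
    return rng.sample(high, n), rng.sample(low, n)


def pooled_mean_sdrs(interviews, seed=0):
    """Pool balanced phases over interviews and return
    (mean SDR of high phases, mean SDR of low phases)."""
    rng = random.Random(seed)  # fixed seed keeps the discards repeatable
    highs, lows = [], []
    for high, low in interviews:
        kept_high, kept_low = balance_phases(high, low, rng)
        highs.extend(kept_high)
        lows.extend(kept_low)
    return mean(highs), mean(lows)


if __name__ == "__main__":
    # Invented phase SDRs for two interviews of one patient.
    interviews = [
        ([0.05, 0.07, 0.06], [0.03, 0.04]),
        ([0.08], [0.02, 0.05]),
    ]
    high_mean, low_mean = pooled_mean_sdrs(interviews)
    print(f"high = {high_mean:.3f}, low = {low_mean:.3f}")
```

Balancing within each interview, rather than over the pooled set, keeps any one hour from contributing more phases of one kind than of the other, which is the concern the footnote attributes to the "hour effect."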

H2. In the case of Mrs. Alpha, it will be
recalled, two judges besides her therapist
judged the interviews; for Mrs. Beta there
were three additional judges. The data
analysis was the same as that for H1. This
hypothesis was not supported. None of the
five sets of judgments by the nontherapists
showed the predicted SDR discrepancy.

H3. It must be stated at the outset that
this hypothesis could not be directly tested
without altering the design such that it would
no longer constitute a replication. The inherent
statistical obstacle is that the free
judgment instructions require the judges to
establish their own units, i.e., motivational
phases, which they are to label high or low
anxiety. Since judges can and do differ with
one another regarding the number and limits
of the phases they discern in a given interview,
there are no common units on which
to base a quantitative statement about their
agreement.

[...] comparison units, minutes,
say, would serve the purpose. Such minute-units
could not defensibly be regarded as
independent events for statistical analysis
[...] judgments had been [...]
2 to 20 consecutive [...]

[...] a direct test was impossible, a
method was devised to permit a statement
[...] on some passages
[...] on others. (b) If the [...]


Donald S. Boomer and D. Wells Goodrich


where it exists, does not represent agreement-about-anxiety,
or else the SDR does
not measure anxiety.

The statistical procedure was as follows: 
Transcripts of the eight test interviews were 
divided into minutes. These minute-units 
were then consecutively tabulated in terms 
of their labeling as high or low anxiety by 
all judges. Consensual “phases” were con- 
structed by selecting all sequences of 2 or 
more consecutive minutes in which all judges,
or all judges but one, agreed. It was possible
thus to construct eight high and eight low
consensual phases for each patient. These
constructed phases were comparable in length
to the phases which had emerged from the
individual judgments since they averaged 6
minutes with a range of from 2 to 19 minutes.
In the aggregate, these phases were a
large sample, representing more than 50%
of the interview material.
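
The construction of consensual phases described above can be sketched in Python. The sketch assumes that each judge's phase judgments have already been expanded into one label per minute ("high" or "low"); the function names, the label strings, and the sample data are hypothetical. A minute receives a consensus label when all judges, or all but one, agree, and a consensual phase is any run of two or more consecutive minutes sharing the same consensus label.

```python
from itertools import groupby


def minute_consensus(labels_by_judge):
    """Return one consensus label per minute, or None where more
    than one judge dissents. Expects one sequence per judge."""
    n_judges = len(labels_by_judge)
    if n_judges < 3:
        # With two judges, "all but one" would accept any label.
        raise ValueError("at least three judges are required")
    consensus = []
    for minute_labels in zip(*labels_by_judge):
        label = None
        for candidate in ("high", "low"):
            if minute_labels.count(candidate) >= n_judges - 1:
                label = candidate
        consensus.append(label)
    return consensus


def consensual_phases(labels_by_judge, min_length=2):
    """Return (label, start_minute, end_minute) runs, end exclusive,
    of at least min_length consecutive consensus minutes."""
    phases = []
    start = 0
    for label, run in groupby(minute_consensus(labels_by_judge)):
        length = len(list(run))
        if label is not None and length >= min_length:
            phases.append((label, start, start + length))
        start += length
    return phases


if __name__ == "__main__":
    # Invented minute-by-minute labels from three judges.
    judges = [
        ["high", "high", "low", "low", "low", "high"],
        ["high", "high", "high", "low", "low", "low"],
        ["high", "low", "low", "low", "low", "high"],
    ]
    print(consensual_phases(judges))
```

With three judges every minute receives a consensus label, since two of three always agree; with four judges an even split leaves the minute unlabeled and breaks any run that would otherwise span it.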

Analysis revealed that for neither patient 
did the high anxiety and low anxiety con- 
sensual phases differ with regard to mean 
SDR. In terms of the argument presented 
above, these data furnish no support for the 
hypothesis that consensus among judges
represents consensus about anxiety, if anxiety
is presumed to be accompanied by high
SDRs. Further elaboration of this point will 
be reserved for the discussion section. 


DISCUSSION 


The results of this study can be summarized
as follows: for one of the two patients
Mahl’s finding was successfully replicated,
i.e., her therapist’s judgments of
periods of high and low anxiety were positively
and significantly related to the SDR
during these periods. Anxiety judgments of
the same interview material made by three
additional judges failed, however, to show
any significant relationship to the SDR.

For a second patient her therapist’s anxiety
judgments and those of two additional
judges uniformly failed to show any relationship
to SDR. Finally, those sections of
both patients’ interviews identified by the
majority of the judges as “more anxious”
showed no higher SDR than those sections
consensually labeled “less anxious.”

These findings do not lend themselves to
any clear-cut conclusions about the SDR.
The following issues, however, have been
clarified:


1. The Mahl hypothesis was supported in 
one of the cases, suggesting that his results 
have a degree of generalizability which war- 
rants further research. 

2. In the other case the findings were negative.
This indicates that the SDR cannot be
uncritically accepted as a universal measure
of intercurrent anxiety in psychotherapy interviews.
Further investigation is necessary in
order to specify the conditions under which
the SDR can be meaningfully employed.

3. The hypothesized relationship between
SDR and anxiety judged from transcripts
holds, when it does hold, only for the judgments
made by the therapist who treated
the patient. This was true of Mahl’s patient
and of Mrs. Beta in the present study. Thus
Mahl’s criterion, the therapist’s judgment,
remains essentially private and unrepeatable.
The unfortunate consequences of this state
of affairs can be seen in the present study.
In seeking to validate a new measure [...]
against a criterion of unknown reliability
only positive results are informative.
Negative findings leave the issues confounded.
The failure of replication with Mrs.
Alpha, for example, may be interpreted in
several equally likely ways. It is possible
[...] her anxiety fluctuations are [...]
SDR, or that her anxiety fluctuations
are not reflected in her written transcript,
or that these particular judges are not
sufficiently sensitive to discern her anxiety
fluctuations.

These, of course, are not all of the possible
[...] but they suffice to make the
familiar point that one of the ends [...]
measurement hypothesis requires [...]
anchoring. Mrs. Alpha’s data, being [...]
[...] that the SDR hypothesis [...]
[...] being equivocal, they [...]
crucial evidence for any specific refinement.

Further Research. The foregoing discussion [...]
need for more specification [...]
[...] and the limitations [...]
[...] illustrated were the shortcomings [...]
research use of clinical judgment of anxiety
fluctuations in providing the necessary cru- 
cial evidence. No simple solution can be 
suggested for this problem. No highly reli- 
able measure of anxiety can be offered as a 
substitute for clinical judgment, since no 
such measure exists. At our present stage of 
knowledge the phenomena associated with 
anxiety can be accounted for only by a 
complex description which involves observa- 
tions and inferences about physiological 
activity, private experience, and observable 
behavior, both verbal and nonverbal. The 
complex interrelationships among this loose 
network of phenomena are not now suffi- 
ciently explicit to warrant the belief that 
anxiety can be scaled along any single
dimension or gauged at any one level of
functioning.3

In the face of such theoretical complexity 
the preferred research strategy might be to 
regard the SDR as a promising measure of 
certain aspects of anxiety in certain classes 
of people under certain conditions. Experi- 
mental work, foregoing the attempt to 
demonstrate that the SDR measures anxiety, 
could focus on some limited prior questions 
concerning the psychological properties of 
the measure and the manner and conditions 
of its covariance with other reproducible and 
reliable measures which may also be pre- 
sumed to reflect some aspects of anxiety. 

Some research along these lines has already
begun. Panek and Martin (1959), building on
Mahl’s work, have demonstrated with a group of
psychotherapy patients that GSR dips are
preceded by rising SDR gradients and followed
by declining gradients. Dittmann⁴ is studying
some temporal relationships between SDR and
certain body movements. The present authors are
investigating possible relationships between
SDR and rate of speech. The strategic advantage
of this part-problem approach to anxiety
measurement is that the use of reproducible
methods and measures makes it possible
ultimately to integrate these and other similar
studies experimentally and conceptually.

3 Dibner (1958) intercorrelated five measures of
anxiety: skin conductance, patients’ self-ratings,
clinicians’ ratings, and two separate measures of
speech disturbance. Of the 10 correlations thus
generated, [illegible] were not significantly
different from zero.

4 Personal communication, 1959.


SUMMARY 


This research was an attempt to repeat a 
pioneering study by Mahl (1956), in which 
he demonstrated that the incidence of certain 
disturbances of speech increased during por- 
tions of psychotherapy interviews judged to 
be anxious, and decreased during portions 
judged less anxious. 

The results of the present test were
inconclusive. The anxiety judgments for one
patient made by her therapist supported
Mahl’s finding, but in a second case the
judgments made by the therapist failed of
replication. The judgments made by five
additional judges who were not the patients’
therapists uniformly failed to show the
hypothesized relationship to the speech
disturbance measures. Reconciliation of these
results must await further research.


NORMAL, HYPNOTICALLY INDUCED, AND FEIGNED
ANXIETY AS REFLECTED IN AND DETECTED
BY THE MMPI¹


ALBERT A. BRANCA AND EDWARD E. PODOLNICK²
University of Delaware 


The use of hypnosis as a technique for
the production of signs of disorder in normal
people has been attempted. Luria (1932), and
Huston, Shakow, and Erickson (1934) showed
that word association techniques together
with certain motor responses were successful
in revealing the presence of emotion arousing
conflicts that had been suggested in hypnosis.
Fisher and Marrow (1934) reported significant
differences in reaction times obtained in
hypnotically induced “moods” of elation and
depression. Sweetland ([illegible]) suggested
certain psychiatric syndromes to normal
subjects who had been hypnotized. Comparison
of MMPI profiles obtained when these syndromes
were suggested indicated that it was possible
to produce “laboratory neuroses” by hypnosis.
Grosz and Levitt (1959) suggested anxiety to
hypnotized medical and nursing students. They
reported increased scores on the Taylor
Manifest Anxiety Scale and diminished scores
on the Barron Ego Strength scale. They also
reported that scores on the two tests taken
during the waking state did not differ from
those obtained during hypnotic states when
anxiety was not suggested.

Studies also show that the MMPI validity
scales can identify dissemblers. [illegible]
(1947) reported that the MMPI was able to
identify “fakers” even when they were
psychiatrists, clinical psychologists, and
[illegible] workers who were familiar with
diagnostic signs of behavior disorders as well
as with the MMPI. Other investigators
([illegible], Chance, & Judson, 1949; Hunt,
[illegible]) state that the MMPI, through
separate or combined use of its validity
scores, is capable of differentiating between
dissemblers and other groups.

1 This research was supported by a University of
Delaware Faculty Summer Research Grant.

In 1952 Welsh added an Anxiety (A) 
scale to the MMPI. This development has 
made it possible to observe the effects of 
suggesting this simpler and more general 
symptom of disorder. 

The specific hypotheses of this experiment are:


1. There will be a significant increase in 
the A scale of the MMPI between scores 
obtained under normal conditions and under 
conditions of hypnotically induced anxiety. 

2. The validity scales of the MMPI will 
differentiate between profiles obtained under 
conditions of dissembling and profiles ob- 
tained under conditions of both normal and 
hypnotically induced anxiety. 


METHOD 


Subjects 

Ten students, of whom eight were female, were 
used as subjects in this experiment. The normal 
records were obtained from students who had taken 
the MMPI as part of a classroom demonstration. 
At the time the first profiles were obtained, the 
students were not aware that they might be called 
upon to participate in an experiment. Students 
from other freshman and sophomore courses volun- 
teered to participate when they had heard about 
the study. Two of these students were used as ex- 
perimental subjects. These two were given to believe 
that the MMPI was being used as a screening 
device and not a part of the experiment proper. 

Experimental candidates were selected on the basis
of: (a) normal MMPI profiles and anxiety scores;
(b) absence of a history of treatment for mental
disorder; (c) absence of a history of epilepsy,
convulsions, or neurological disease of any type;
(d) a willingness to participate in the experiment.
Actual subjects were selected from this larger group
on the basis of hypnotizability. The criterion for
depth of trance was the elicitation of positive
auditory and visual hallucinations. Out of a total
of 50 experimental candidates, 10 met this
criterion, 2 males and 8 females. This percentage
is consistent with others also reporting
approximately 20% success in obtaining a deep
trance (Dorcus, 1956). A disproportionately large
number of females volunteered to participate in
the experiment.

2 Now at Bucknell University.


Procedure 


Each hypnotic session was held in a room with 
a one-way observation screen and an intercom 
system. In this way one experimenter was able to 
observe each session while the other performed the 
hypnosis. Each candidate was made aware of this 
observation.

The first phase of the experiment consisted of 
training sessions wherein the experimental candidates 
were trained to achieve the trance state. When a 
depth of trance was reached in which positive 
auditory and visual hallucinations were produced, 
the candidate met the criterion for inclusion as 
an experimental subject and an anxiety state was 
suggested. The instructions for producing anxiety
were obtained from definitions and descriptions of
anxiety by various authors (Conklin, 1936; Heyns,
1958; Lehner & Kube, 1955; May, 1950; Shaffer &
Shoben, 1956; Warren, 1934). These instructions
were as follows:


You are beginning to feel very uneasy and
anxious. You don’t know why, but this uneasy
feeling is making you nervous, irritable, and
frightened. You feel as if something dreadful is
about to happen but you don’t know what. This
feeling of dread is mingled with a curious feeling
of hope that is very unpleasant. You are
becoming more and more apprehensive. You are in
a state of anxious expectation and self-doubt. You
feel now as if you are threatened and it frightens
you. You feel as if you are about to lose
something important to you, or be hurt. This anxiety
is becoming stronger and stronger. Now you feel
as if something is wrong, as though you had
neglected to do something very important, but
you can’t recall what it is. You feel, though,
that whatever it is, it is making you feel on edge
and uneasy. It is making you feel blue,
melancholy, unhappy, and excited in an unpleasant
way. You feel frightened, but you don’t know
what it is you are frightened about. This is
certainly an unpleasant form of excitement. You
are now very apprehensive and anxious.


After the anxiety instructions were read, the
MMPI was readministered. The subjects took
between 70 and 90 minutes to complete the MMPI.
In order to maintain the trance state for that
period, instructions and suggestions reinforcing
the trance state were given when the subject had
reached the halfway point in the test. At that
time the anxiety instructions were also reread to
each subject. The subjects were aroused from the
trance state after suggestions counteracting the
anxiety were made. These instructions, given
twice, were as follows:


You are beginning to feel less apprehensive and
anxious. The unpleasant form of excitement caused
by the fact that you were frightened is leaving
you. You are beginning to feel happier, more
alert, and relaxed. You no longer feel on edge or
uneasy, and you are experiencing a feeling of
well-being. You are now confident and at ease.
You feel happy and at peace with the world.
You are experiencing a soothing calmness and
you feel warm, relaxed, comfortable, and alert.
You don’t feel nervous, irritable, or frightened
any more. You are no longer apprehensive and no
longer feel self-doubt. You don’t feel as if you
are frightened or are about to be hurt. You don’t
feel as if you’re about to lose something but
don’t know what. You are now very relaxed. You
feel as if all of your troubles and problems are
leaving you. You feel as if all of your fears are
gone and this gives you a feeling of ease and
comfort. You are happy and relaxed and normal.


At the hypnotic session, subjects were instructed 
to remember all events that occurred during the 
trance state. 

In the final phase of the experiment, which took
place approximately a week after the first phase,
each subject was told that he was to make believe
that he was anxious and that he was to “fake”
anxiety while taking the MMPI. The same
description of the anxiety state was read to him
again with the statements “make believe that” or
“pretend that” prefixing each sentence. He was
further instructed to mark the test as though he
were trying to create the test profile of a person
suffering great anxiety.


TABLE 1
MMPI T SCORE MEANS AND STANDARD DEVIATIONS
UNDER NORMAL CONDITIONS, UNDER CONDITIONS OF
HYPNOTICALLY INDUCED ANXIETY, AND UNDER
CONDITIONS OF DISSEMBLING

        N Condition                HIA Condition              D Condition
Scale   M            SD            M            SD            M            SD
? a     [illegible]  [illegible]   [illegible]  [illegible]   [illegible]  [illegible]
L       47.9         7.78          46.3         6.62          42.4         4.81
F a     3.1          [illegible]   6.0          3.83          25.0         11.87
K       58.1         8.49          51.3         8.79          42.9         8.99
Hs      52.4         7.28          50.3         5.16          70.8         15.6[?]
D       47.8         5.41          56.1         12.50         79.3         17.31
Hy      56.2         7.73          54.9         [illegible]   71.0         [illegible]
Pd      53.8         10.90         59.0         13.26         80.5         14.49
Mf      42.2         8.16          45.5         9.16          49.7         10.24
Pa      49.4         6.13          60.4         10.85         87.3         21.49
Pt      54.1         7.95          61.4         12.77         85.4         14.9[?]
Sc      55.2         6.48          65.3         10.87         98.8         19.29
Ma      60.6         10.30         63.7         [illegible]   71.5         [illegible]
Si      49.9         8.67          56.5         12.42         72.2         13.17
A       44.3         4.83          54.5         10.70         72.6         [illegible]
R       [illegible]  7.07          46.7         5.54          49.0         8.6[?]

Note. N = 10.
a Based on raw scores.


TABLE 2
ANALYSES OF SCORES UNDER NORMAL, HYPNOTICALLY INDUCED ANXIETY, AND
DISSEMBLING CONDITIONS

                  N and HIA               HIA and D              N and D
Scale   F b       MD      t               MD      t              MD           t
? a     3.31      -1.1    1.88            -.1     [illegible]    [illegible]  [illegible]
L       1.88      -1.6    1.14            -3.9    [illegible]    -5.5         3.84
F a     29.39**   2.9     3.05**          19.0    [illegible]    21.9         3.50**
K       7.55**    -6.8    4.37**          -8.4    [illegible]    -15.2        6.15**
Hs      11.80**   -2.1    1.60            20.5    [illegible]    18.4         3.84**
D       16.48**   8.2     2.21            23.2    [illegible]    31.5         5.89**
Hy      12.08**   -1.3    .79             16.1    [illegible]    14.8         4.46**
Pd      11.91**   5.2     1.07            21.5    [illegible]    26.7         4.20**
Mf      1.66      3.3     2.36            4.2     [illegible]    7.5          1.99
Pa      16.45**   11.0    2.78*           26.9    [illegible]    37.9         6.04**
Pt      17.93**   7.3     2.17            24.0    [illegible]    31.3         7.21**
Sc      29.44**   10.1    3.07*           33.5    [illegible]    43.6         7.13**
Ma      2.51      3.4     1.45            7.8     [illegible]    10.9         2.47
Si      9.80**    8.6     4.00**          15.7    [illegible]    22.3         4.42**
A       27.34**   10.2    [illegible]**   18.1    [illegible]    28.3         10.90**
R       0.27      -.1     .56             2.3     [illegible]    1.6          .45

Note. Minus signs indicate that the scores for the second
condition listed were lower than those of the first.
a Based on raw scores.
b Values of F obtained by analysis of variance for each scale
under the three conditions of the experiment.
* Significant at the .05 level.
** Significant at the .01 level.

In order to provide an additional subjective check
of the subjects’ emotional states during the
experiment, an anxiety rating scale was constructed
according to the Likert technique (Edwards, 1957).
It consisted of 40 questions about the way the
subject felt at the time of responding to the scale.
It included items such as: “I am at ease,” “I feel
tense without any good reason,” “My morale is
low,” “I am restless and irritable now.” Each of its
40 items had been shown to discriminate between
high anxious and low anxious groups. It has a
split-half reliability coefficient of .97. This scale
was administered with the MMPI as part of a
classroom demonstration. It was also given under
conditions of hypnotically induced anxiety.

RESULTS 

The data of this experiment consisted of
the MMPI profiles obtained under normal
conditions (N), conditions of hypnotically
induced anxiety (HIA), and conditions of
dissembling (D), as well as scores on the
anxiety rating questionnaire under the first
two conditions. The T scores for each scale
are listed in Table 1. Both the ? and F scales
are given in raw scores. The scores on ? were
well enough below the necessary 30 that
conversion to T scores would necessitate each
raw score having the same T, i.e., a T of 50.
The scores on F were so high under conditions
of dissembling that conversion to T scores
tended to hide differences between this
condition and the other two. Because of the
small N (N = 10), N − 1 was used in computing
the standard deviations (Edwards, 1950).
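
The choice of N − 1 as the divisor can be shown in a few lines. The scores below are hypothetical, not data from this experiment; the sketch only contrasts the two divisors.

```python
import statistics
from math import sqrt


def sd_n_minus_1(scores):
    """Standard deviation with N - 1 in the denominator."""
    n = len(scores)
    mean = sum(scores) / n
    return sqrt(sum((s - mean) ** 2 for s in scores) / (n - 1))


# Hypothetical T scores for N = 10 subjects on one scale.
scores = [44, 47, 50, 52, 55, 58, 61, 49, 53, 51]

sd_sample = sd_n_minus_1(scores)            # divisor N - 1
sd_population = statistics.pstdev(scores)   # divisor N
```

For N = 10 the two values differ by a factor of the square root of 10/9, which is why the correction matters most in small samples such as this one.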

An analysis of variance was performed for
each scale under the three conditions. A
comparison was then made between the T scores
on each scale obtained under N and HIA
conditions, N and D conditions, and HIA
and D conditions. A t test was employed
for this purpose. Because the scores obtained
under these three conditions were not random
with respect to each other, a t comparing
the differences between correlated means was
computed using the differences between the
scores (McNemar, 1949). The mean differences
and the t's for each scale for all combinations
of the three conditions are listed in
Table 2. Although all values of t were
reported, they were marked as significant, in
the conventional manner, only for those
scales where significantly large Fs were
obtained.

In comparing the scores between the N
and HIA conditions, the F, K, Pa, Sc, Si,
and A scales showed significant changes. In
comparing the scores between the HIA and
D conditions, only the ?, L, Mf, Ma, and R
scales showed insignificant changes. Likewise,
the differences between the scores for N and
D conditions were insignificant for the L,
Ma, ?, Mf, and R scales, being significant
for all others.

A t was also computed for the anxiety
rating questionnaire. The scores on this
questionnaire, given twice (N and HIA
conditions), showed a mean change which
is significant at the .01 level.
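
The t for correlated means described above works on the per-subject differences rather than on the two sets of scores separately. A minimal sketch follows, assuming the usual difference-score formula; the function name and the scores are hypothetical and are not taken from Table 2.

```python
from math import sqrt


def correlated_means_t(first, second):
    """Return (mean difference, t, degrees of freedom) for paired scores.

    Differences are taken as second minus first, so a negative mean
    difference means the second condition scored lower.
    """
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Standard deviation of the differences, using N - 1.
    sd_diff = sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    t = mean_diff / (sd_diff / sqrt(n))
    return mean_diff, t, n - 1


# Hypothetical scores for the same four subjects under two conditions.
normal = [1, 2, 3, 4]
induced = [2, 4, 5, 9]
mean_diff, t, df = correlated_means_t(normal, induced)
```

Because each subject serves as his own control, the standard error is built from the spread of the differences, which is usually smaller than the spread of either set of scores.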


DISCUSSION

Observation of Subjects


During the HIA session, there were indica- 
tions of stress on the part of the subjects. 
When asked how they felt, they made com- 
ments such as: “I don’t feel good,” “I feel 
like I want to get out of here,” “I feel un- 
happy,” and “I feel as if something were 
wrong.” In addition to these comments, the 
subjects showed signs that the experimenters 
interpreted as discomfort and distress. These
signs were: furrowing of the brow, clenching
of hands, frowning, tenseness, biting of lips,
and sighing deeply. One subject, a female, burst
out crying while answering the MMPI items. 
The experimenter stopped the test and, see- 
ing that she could not continue because of 
excessive crying, read the alleviating instruc- 
tions. She stopped crying as the instructions 
were being read and agreed to complete the 
test the following day under the same experi- 
mental conditions. 

In general, the subjects concentrated on 
the test and appeared to be making an effort 
to read and answer the items carefully. All 
subjects appeared relieved when the allevi- 
ating instructions were read. This relief was 
evidenced by smiling, relaxation of facial 
muscles, restrained laughter, and remarks
such as: “I feel good now.”


The A Scale 


Hypothesis 1 stated that there would be
a significant increase in the A scale between
scores obtained under normal conditions and
under conditions of hypnotically induced
anxiety. This hypothesis was supported by
the data, the difference in the scores being
significant at the .01 level.

The A scale is made up of items which 
occur on several of the other scales. Cluster 
and factor analyses indicated that the scale 
was relatively homogeneous and seemed to 
be related to anxiety. The scale contains very 
few “obvious” items that deal directly with 
the word anxiety and its synonyms. However, 
the experimenters found seven such items 
that they considered obvious with respect to 
the “anxiety” instructions the subjects re- 
ceived: “I feel anxiety about something or 
someone almost all of the time,” “I must 
admit that I have at times been worried 
beyond reason over something that really did 
not matter,” “I worry quite a bit over pos- 
sible misfortunes,” “I brood a great deal,” 
“I wish I could be as happy as others seem 
to be,” “Most of the time I feel blue,” “I 
very seldom have spells of the blues.” A 
count was made of the number of times 
these seven items were chosen under the 
three conditions, N, HIA, and D. They were 
chosen a total of 13 times under N condi- 
tion, 34 times under HIA condition, and 63 
times under D condition. These differences 
were significant at the .01 level. A t was then
computed for the A scale with these seven
items omitted to find if significant
differences would still be obtained without the
obvious items. The removal of these items
did not alter the degrees of significance
obtained previously with the full scale. This
reduces the likelihood that the elevation of
the A scale was the simple result of a
heightened and conscious intent to comply with
the suggestions of the experimenters.


The Validity Scales 


Hypothesis 2 stated that the validity scales
would differentiate between profiles obtained
under conditions of dissembling and profiles
obtained under conditions of both normal
and hypnotically induced anxiety. This
hypothesis was supported. Using the F minus K
dissimulation index with a cutoff point of plus
[illegible] ([illegible]), all profiles from
both the N and HIA conditions were in the
normal range, indicating valid profiles. In
addition, the F scale alone identified all but
one of the profiles obtained under conditions
of dissembling. The F minus K index also did
not identify this one profile, but identified
all others.
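
The F minus K index referred to above is simply the raw F score minus the raw K score, compared against a cutoff. The sketch below is illustrative only: the cutoff value, the choice of a strict inequality, and the example scores are hypothetical assumptions, not the values used in this experiment.

```python
def f_minus_k(raw_f, raw_k):
    """Raw F score minus raw K score."""
    return raw_f - raw_k


def exceeds_cutoff(raw_f, raw_k, cutoff):
    """True when the index lies above the chosen cutoff (assumed strict)."""
    return f_minus_k(raw_f, raw_k) > cutoff


# Hypothetical cutoff and hypothetical (raw F, raw K) pairs.
HYPOTHETICAL_CUTOFF = 9
profiles = {
    "normal": (3, 15),
    "dissembling": (25, 10),
}
flagged = {
    name: exceeds_cutoff(raw_f, raw_k, HYPOTHETICAL_CUTOFF)
    for name, (raw_f, raw_k) in profiles.items()
}
```

A profile with a high F and a low K, as under instructions to fake, is pushed above the cutoff, while a profile with a low F and a moderate K stays well below it.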

The significant changes in the F and K 
scales obtained from Condition N to Condi- 
tion HIA do indicate that a change in 
test-taking attitude occurred in the latter 
state. The K scores were significantly lower 
in Condition HIA as compared with Condi- 
tion N, indicating that the subjects became 
more critical of themselves. In addition, the 
F scores were significantly higher in the HIA 
condition, indicating that while in this state, 
the subjects answered more of these items in
the direction away from the direction the 
normal standardization groups answered 
them. The experimenters feel, however, that 
this might be expected in that the HIA state 
represents a condition removed from the 
condition under which standardization was 
obtained. 

It is interesting to note that, although not
significantly so, L scores were consistently
lower in the HIA and D states as compared
to N. Perhaps the criticalness of the subjects,
as evidenced by their lower K scores in the
HIA and D conditions, also made them more [...]


The Diagnostic Scales


All diagnostic scales except Mf, Ma, and R
showed significant differences from N to D
conditions. All diagnostic scales except Mf,
Ma, and R showed significant differences
from HIA to D conditions. Since the validity
scales identified dissembling profiles in
Condition D, but not in Condition HIA, these
differences further support the results in
indicating a real difference between the two
conditions of hypnotically induced anxiety and
dissembling. The differences between profiles
obtained in these two conditions suggest that
hypnosis is not a state of mere heightened
cooperation.

In going from Conditions N to HIA, it 
might be expected that only the A scale 
would be increased, since the instructions 
under hypnosis were directed to this effect.
However, the Pa, Sc, and Si scales were also 
significantly heightened. 

The rise in Pa may be considered as a 
direct result of the anxiety instructions. 
Looking post priori, it can be seen that sug- 
gestions such as “You are beginning to feel 
very uneasy and anxious,” and “You don’t 
know why, but this uneasy feeling is making 
you nervous, irritable, and frightened,” might 
easily be reflected in the items composing 
the Pa scale as these items were derived from 
patient samples “symptomatically . . . to
have ideas of reference, to feel that they
were persecuted by individuals or groups
. . .” (Hathaway, 1956, pp. 109-110).

The heightened Sc and Si scales can be 
considered in the same way. Feelings of ap- 
prehension and anxiety, as well as feelings 
of being blue and melancholy, might be re- 
flected in items dealing with social intro- 
version, and schizophrenic symptoms have 
been described in much the same way. The 
D scale, however, was not significantly
increased, indicating that the part of the
instructions relating to feelings of depression
was not alone responsible for these changes.

The change in scores on the Likert-type
anxiety questionnaire from Conditions N to
HIA was significant at the .01 level. The
questionnaire was not given under Condition D
because of its transparency. The change
indicates that each subject’s subjective 
evaluation of the way he felt during the HIA 
session corresponded to the overt behavioral 
differences observed and to the detection of 
these differences effected by the A scale. Most 
of the scores doubled and some increased by 
as much as 100 out of a possible 160 points. 
Only 1 subject out of 10 failed to report an 
increase in anxiety as reflected in the ques- 
tionnaire. 

It is impossible to estimate the effect of
the order of the experimental conditions upon
test performance. The order used in this
design was chosen so as to minimize
contamination of experimental conditions by
prior experience. Obviously the “normal”
administration of the MMPI, which served
as a control and also as the basis for
selecting subjects, had to come first. It was felt
that the faking conditions should be last in
order to prevent the establishment of a faking
set which might persist and intrude upon the
hypnotically induced anxiety condition.


SUMMARY AND CONCLUSIONS 


Ten college students took the MMPI
under three conditions. In the first condition,
the test was taken as part of a classroom
demonstration. The second administration
occurred under conditions of hypnotically
induced anxiety. The third administration
occurred in the waking state after instructions
were given to “fake” anxiety.

Comparisons of the test profiles and the 
results of the specially constructed anxiety 
questionnaire permitted the following con- 
clusions to be drawn: 


2. Overt behavioral signs indicate that
affective changes are experienced when anxiety
is suggested to hypnotized subjects.

3. The anxiety questionnaire revealed that
the subjects reported a marked increase of
feelings of tension, discomfort, unpleasantness,
and apprehension following the “anxiety”
instructions in the hypnotized state over their
reports in the normal waking state.

4. The validity scales of the MMPI suc- 
cessfully identify 9 out of 10 dissemblers and 
show that, in a state of hypnotically induced 
anxiety, valid profiles are obtained. 

5. Significant differences in the diagnostic 
and validity scales between conditions of 
hypnotically induced anxiety and conditions 
of dissembling indicate that the former is dif- 
ferent enough from the latter to strongly sug- 
gest that hypnosis is not a state of mere 
exaggerated cooperation. 


THE EFFECT OF BRAIN DAMAGE UPON SPEED,
ACCURACY, AND IMPROVEMENT IN VISUAL
MOTOR FUNCTIONING


KENNETH B. STEIN¹
Veterans Administration Regional Office, San Francisco 


The purpose of this study was to investigate
three aspects of visual motor functioning
in relation to cortical deficit. These aspects
were visual motor speed, accuracy in visual
motor reproduction, and improvement. These
[illegible] were observed simultaneously within
a single unitary time span and a single unitary
task. The relative importance of each of
these modalities as well as their
interrelationships were also studied.

Klebanoff (1945) and Klebanoff, Singer,
and Wilensky (1954) have extensively reviewed
studies of brain damage. The reviews reveal
that some of these three visual motor features
have been studied either implicitly or
explicitly.

The speed variable is dealt with in
[illegible]. Some of the performance subtests
of the WAIS (Wechsler, 1955) lend themselves
to measures or scores based on time or time
[illegible]. Almost all investigators have
found some of these visual motor subtests
affected by cortical damage such as the
Digit Symbol (Klebanoff et al., 1954, p. 13)
and the Block Design (Aita, Armitage,
Reitan, & Rabinowitz, 1947; [illegible], 1947;
Goldman, Greenblatt, & Coon, [illegible];
Greenblatt, Goldman, & Coon, 1946; Lidz,
[illegible], & Tietze, 1942). Another example
of [illegible] speed tasks was the cancellation
techniques used by Hunt ([illegible], 1939)
and Rylander (193[illegible]). [illegible]
reduced speed following lobotomy. Each of them
also employed an accuracy score. Hunt found
that subjects demonstrated greater accuracy
after lobotomy. Rylander did not find any
significant difference in accuracy scores.

1 The author wishes to thank R. C. Tryon,
[illegible] California, Berkeley, and [illegible]
Henderson [illegible] members of the Veterans
Administration [illegible] Francisco, for their
helpful suggestions [illegible] reading of the
manuscript.

As with tests for speed, there are specially 
devised techniques which focus upon the 
qualitative or accuracy aspect of functioning 
by the brain injured. The Bender-Gestalt 
(Bender, 1933) and the St. Louis Memory- 
for-Design (Graham & Kendall, 1946) Tests 
involve the reproduction of geometrical 
figures. In these tests it is often noted that 
subjects with cortical lesions tend to re- 
organize the figures qualitatively in the 
direction of simplification. Fragmentation, 
rotation, reversal, and greater closure and 
balance are some of the manifestations of 
simplification. These perceptual motor dis- 
turbances have been discussed in terms of 
Goldstein’s theory of concrete and abstract 
attitudes (Goldstein & Scheerer, 1941) as 
well as gestalt theory (Bender, 1933). 

One would expect that the concrete atti- 
tude of the organic would be revealed not 
only in its effect upon speed and accuracy 
but also upon improvement. Goldstein and 
Scheerer (1941) indicate that the organic 
does not profit as readily from experience as 
the normal. McFie and Piercy (1952) found 
impairment of retention and learning which 
was related to the size of the lesion. 

The main hypothesis for investigation in
the present experiment was that organics
differ significantly from nonorganics in speed,
accuracy, and improvement in visual motor
functioning. As an auxiliary problem an
exploration was made of the interrelationships
of these three variables as well as two
additional ones, IQ and age. Finally, each
variable as well as the combination of these
variables was assessed for power of
discrimination between organics and nonorganics.


METHOD 
Subjects 


There were 60 subjects with cortical brain 
damage and 120 controls. All were white and 
American born. Of the 120 controls 15 were female 
and 105 male, whereas all the organics were male.²
Forty-two of the experimental subjects were VA 
outpatients and the remaining 18 were hospitalized 
veterans. Diagnostically the organic subjects were 
distributed as follows: 27 posttraumatic encephalop- 
athies, 14 cortical insults associated with circula- 
tory disorders, and 8 preoperative and postoperative 
tumors. The remaining 11 experimental subjects 
had varying diagnoses such as paresis, encephalitis, 
tuberculous meningitis, and cortical atrophy. None 
of them was considered psychotic. Of the 120 con- 
trols, 51 were VA outpatients with nonpsychotic 
psychiatric diagnoses; 29, outpatients in treatment 
for tuberculosis; and 40, general medical outpatients. 
None of the experimental and control subjects had 
any noticeable visual defect or motor impairment 
of their writing hand. 

The 180 subjects were drawn from a larger pool
of 261 subjects in order that the organics and
controls could be equated for age, education, and
IQ. Other clinical groups such as various psychotics
will be used in a subsequent study and should
provide a comparison with the results on the present
groups.


Procedures 


Vocabulary Subtest. The Vocabulary subtest of
the Wechsler-Bellevue scale, Form I (Wechsler,
1944) was used to obtain an estimate of verbal
IQ. Morrow and Marks (1955) found no significant
difference between brain injured and control
subjects on vocabulary. This lack of difference led
the authors to conclude that this measure was an
adequate indicator of premorbid IQ. Both Jackson
(1955) and Rapaport (1945) report that the
Vocabulary subtest tends to be relatively refractory
to impairment.

Symbol-Gestalt Test. It was expected that the
organics with their concrete attitude would show
significantly greater difficulty than nonorganics with
both speed and accuracy as well as with
improvement. As a means of tapping these three aspects
of visual motor functioning, a symbol substitution
task was developed. The format is similar to the
Digit Symbol subtest of the WAIS but new symbols
have been devised³ (see Figure 1). These symbols
were constructed to have poor gestalt form, i.e.,
they lack closure and balance. They have gaps and
unequal as well as nonparallel lines. This visual
motor task has a 3-minute time limit and the
number of substitutions made for each minute was
recorded by the experimenter. There are a total
of 110 substitution items. The time limit prevents
subjects from completing all items so that no one
achieves a maximum score.

2 As part of an earlier unpublished study the
author found no significant differences for the sex
variable on the Symbol-Gestalt procedure used in
this experiment. The 22 females’ mean score = .687,
SD .536; 22 males’ mean score = .606, SD .448. The
t = .534, which is insignificant. Both groups were
equated for age and IQ.


Fig. 1. Symbol-Gestalt Test: symbols and numbers.


This task yielded one speed, one improvement,
and three accuracy measures which could be
statistically compared for the experimental and
control groups. The speed score (3-minute complete)
was the total number of substitutions made in the
3-minute time limit. The improvement measure was
calculated by subtracting the total number of correct
symbols in the first minute from the total number
in the third minute. The three accuracy measures
were: (a) the total number of correct substitutions
in 3 minutes (3-minute correct); (b) the percent
error (% error) based on the number of errors in
the first 40 substitutions; and (c) the number of
qualitative errors (Q error)—rotations, reversals,
wrong substitutions, and distortions. The instructions
for administration of the test to the subjects were
similar to those of the Digit Symbol subtest
(Wechsler, 1955).
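
The scoring rules just described can be illustrated with a short Python sketch. It is not the author's scoring procedure. The `MinuteTally` record and the function name are hypothetical, and it assumes the examiner has recorded attempted and correct substitutions for each minute, the errors among the first 40 substitutions, and a count of qualitative errors.

```python
from dataclasses import dataclass


@dataclass
class MinuteTally:
    """Hypothetical per-minute record kept by the examiner."""
    attempted: int  # substitutions made during this minute
    correct: int    # how many of those were correct


def symbol_gestalt_scores(minutes, errors_in_first_40, qualitative_errors):
    """Derive the five scores named in the text from raw tallies."""
    if len(minutes) != 3:
        raise ValueError("the task has a 3-minute time limit")
    return {
        # speed: every substitution made, right or wrong
        "three_minute_complete": sum(m.attempted for m in minutes),
        # accuracy (a): correct substitutions over all 3 minutes
        "three_minute_correct": sum(m.correct for m in minutes),
        # accuracy (b): percent error over the first 40 substitutions
        "pct_error": 100.0 * errors_in_first_40 / 40,
        # accuracy (c): rotations, reversals, wrong substitutions, distortions
        "q_error": qualitative_errors,
        # improvement: third-minute correct minus first-minute correct
        "improvement": minutes[2].correct - minutes[0].correct,
    }


example = symbol_gestalt_scores(
    [MinuteTally(18, 15), MinuteTally(20, 17), MinuteTally(21, 18)],
    errors_in_first_40=6,
    qualitative_errors=1,
)
print(example)
```

The example tallies are invented for illustration only and are not data from the study.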


RESULTS 
Equality in Education, Age, and IQ


In the statistical treatment of the meas-
ures, t tests were calculated. Table 1 reveals
first that the two groups were equated for
education, age, and IQ. Since these variables
may be related to speed, accuracy, and
improvement independently of brain damage,
it was necessary to hold them constant. Thus
it was possible to assess the unique effects
of organic injury upon the visual motor
functions.


Differences between Organics and Controls 


The controls demonstrated significantly
greater speed than the organics as shown by
the 3-minute complete score. The mean


3 Although this test was devised by the author,
it was first used by Phelps (1952) in a modified
form.
TABLE 1

MEANS, STANDARD DEVIATIONS, RANGES, AND t VALUES FOR 120 CONTROLS AND 60 ORGANICS

Variable               Mc       Mo     Mc−Mo    Range c     Range o      SDc     SDo       t
Education            11.17    11.23      .06      5–16        5–16       2.66    2.78     .003
Age                  39.91    40.42      .51     20–67       21–65      14.08   13.68     .233
IQ                  114.39   113.27     1.12    91–133      92–139       9.0    10.05     .132
3-minute complete    59.02    42.03    16.99    25–105       17–90      16.72   14.19    7.127**
3-minute correct     50.08    29.92    20.16     21–94        4–59      15.20   13.18    9.186**
% error              14.95    29.88    14.93      0–55        0–85      12.55   17.53    5.888**
Q error                .63     1.62      .99      0–15        0–13       1.59    2.56    2.729*
Improvement           2.16      .45     1.71   −5 to +9    −9 to +9      3.13    3.35    3.295**

* p at .01 level is 2.645.


difference of 16.99 symbols achieved a p
below the .001 level (see Table 1).

Of the three accuracy measures, the 3-
minute correct score yielded the largest t
value of 9.19, which has a p also below the
.001 level. The controls completed 20.16
more correct symbols than the brain damaged
subjects.

The % error, the second accuracy meas-
ure, revealed that the controls committed
half as many errors as the experimental
group, resulting in a t of 5.89 which is sig-
nificant at less than the .001 level. The
organics had a wider range and greater
variability of % error scores.

In the Q error, the third accuracy meas-
ure, the organics produced almost three times
as many such errors as the nonimpaired
group. The t of 2.73 is significant at less than
the .01 level. Although the controls showed a
wider range of these errors (0-15) than
the organics (0-13), this was due to one
control subject who was the only one with
more than 5 such errors. Over 10% of the
organics made 5-13 qualitative errors.

Improvement, the last measure, showed 
the organics had improved very little whereas 
the controls had a mean improvement of 
more than two symbols. The t value was 3.29
which is significant at the .001 level. 


Relationship between Variables 


To ascertain a measure of amount of im- 
pairment of the three aspects of visual motor 
functioning, point biserial correlations were 
calculated between each score and the 
dichotomous criterion variable, i.e., organic— 
nonorganic. In order of size the correlations 
for the five scores were as follows: 3-minute 
correct .548, 3-minute complete .451, % error 
—.440, improvement .245, and Q error —.230. 
In addition age and IQ had near zero cor- 
relations, .018 and .056, respectively. 
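
As a hedged illustration of the statistic used here, the Python sketch below computes a point biserial correlation between a continuous score and a two-category criterion. The function name and the toy data are assumptions made for the example, and the formula is the standard textbook one (population standard deviation of all scores), which the article itself does not spell out.

```python
import math


def point_biserial(scores, in_group):
    """Point biserial r between scores and a dichotomous criterion.

    `in_group` holds one truthy/falsy flag per score (for example,
    True for control and False for organic).
    """
    n = len(scores)
    if n == 0 or n != len(in_group):
        raise ValueError("scores and flags must be the same nonzero length")
    group1 = [s for s, g in zip(scores, in_group) if g]
    group0 = [s for s, g in zip(scores, in_group) if not g]
    if not group1 or not group0:
        raise ValueError("both categories must be represented")
    mean_all = sum(scores) / n
    # population SD of all scores taken together
    sd_all = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    if sd_all == 0:
        raise ValueError("scores have zero variance")
    p = len(group1) / n
    mean_diff = sum(group1) / len(group1) - sum(group0) / len(group0)
    return mean_diff / sd_all * math.sqrt(p * (1 - p))


# Toy data, invented for illustration only.
toy_scores = [1, 2, 3, 4, 5, 6]
toy_flags = [False, False, False, True, True, True]
print(round(point_biserial(toy_scores, toy_flags), 4))
```

The sign of the coefficient depends only on which category is flagged as true, so the negative values reported for % error and Q error simply reflect that organics scored higher on those two measures.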

Table 2 shows the intercorrelations for 
the seven variables on the entire sample. 
Although there were a number of significant 


TABLE 2

INTERCORRELATIONS FOR TOTAL SAMPLE ON SEVEN VARIABLES

                               3-Minute   3-Minute     %
Variable                IQ     Complete   Correct    Error   Q Error   Improvement
Age                      …      −.549      −.585      .267     .182      −.147
IQ                               .173       .077      .008       …          …
3-minute complete                           .895     −.196    −.117       .203
3-minute correct                                     −.581    −.335       .358
% error                                                        .555       .066
Q error                                                                     …
Improvement

Note.—Correlations of .193 and .146 are significant at the .01 and .05 levels, respectively.
TABLE 3

INTERCORRELATIONS FOR CONTROL AND ORGANIC GROUPS ON SEVEN VARIABLES

                                       3-Minute        3-Minute
Variable                 IQ            Complete        Correct         % Error         Q Error        Improvement
                       C      O       C      O       C      O       C      O       C      O       C      O
Age                  −.244   .187   −.602  −.620   −.680  −.708    .235   .379    .140   .250   −.193    …
IQ                                   .323   .101    .287  −.066    .008   .251    .025   .152    .048    …
3-minute complete                                   .874   .849    .093  −.151    .072  −.149    .069    …
3-minute correct                                                  −.374  −.616   −.192  −.380     …      …
% error                                                                           .458   .576     …      …
Q error                                                                                          −.017    …
Improvement

Note.—For controls .234 and .179 are significant at .01 and .05 levels, respectively; for organics .328 and .252 are significant at
.01 and .05 levels, respectively.


correlations, only a few were large enough 
to account for a sizeable amount of the 
variance. The speed score (3-minute com- 
plete) correlated .895 with one of the ac- 
curacy scores, 3-minute correct, but showed 
low correlations with the remaining two 
accuracy scores (with % error —.196, with 
Q error —.117). The speed score also had 
a high relationship of —.549 with age. Like- 
wise the 3-minute correct accuracy score was 
highly related to age (—.585). 

As might be expected the three accuracy
measures were significantly correlated with
each other. The 3-minute correct with %
error yielded an r of −.581, 3-minute correct
with Q error was −.335, and Q error with
% error was .555. Thus 3-minute correct
correlated negatively with the other two
accuracy scores, whereas Q error and %
error were positively related to each other.

In order to compare the organic and 
control groups further, a correlation matrix 


was obtained for each group on the seven
variables. These correlations are found in
Table 3. There were a few correlations which
seem to be divergent for the two groups.
The correlations between age and IQ were
in opposite directions for the two groups.
The controls showed an r of −.244 which
is significant at the .01 level. The organics
showed an r of .187 which falls short of
significance. Yet the difference between these
two correlations transformed to Fisher’s z
values yielded a z equal to 2.284 which is
significant at less than the .05 level. The
nonorganics showed a significant relationship
between IQ and 3-minute correct (.295) but
the organics did not (−.066). The difference
is significant with a z of 2.228.

The organics had a higher correlation be-
tween 3-minute correct and % error (−.616)
than the controls (−.374). The z of 2.00
is again significant for the


TABLE 4

THE CLASSIFICATION ACCURACY OF EACH VARIABLE BASED ON 120
CONTROLS AND 60 ORGANICS

                      Cutoff      # Overlap         % Overlap       % Correct
Variable              Score     C     O     T      C     O     T    Classification
3-minute complete       42     28    18    46     23    30    26         74
3-minute correct        37     17    20    37     14    33    21         79
% error                 31     11    33    44      9    55    24         76
Q error                  3      5    48    53      4    80    29         71
Improvement              0     32    34    66     27    57    37         63

difference be-
TABLE 5

THE CLASSIFICATION ACCURACY FOR SIX VARIABLES
COMBINED USING B COEFFICIENTS

Group          N     # Overlap    % Overlap
Controls      120       13          10.83
Organics       60        8          13.33
Total         180       21          11.67



tween the two correlations. The remaining 
correlational differences are not significant. 
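
The comparison of two independent correlations can be sketched as follows. This Python fragment assumes the usual Fisher r-to-z procedure, with a standard error of sqrt(1/(n1 − 3) + 1/(n2 − 3)); the article does not state its exact formula, so that choice and the function name are assumptions.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    if min(n1, n2) <= 3:
        raise ValueError("each sample needs more than 3 cases")
    z1 = math.atanh(r1)  # Fisher transform of the first correlation
    z2 = math.atanh(r2)  # Fisher transform of the second correlation
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / standard_error


# 3-minute correct with % error: controls (n = 120) versus organics (n = 60)
z = fisher_z_difference(-0.374, 120, -0.616, 60)
print(round(z, 2))
```

Under these assumptions the result for the 3-minute correct and % error pair comes out at roughly 2.0, in line with the z reported in the text.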


Diagnostic Classificatory Power of the Measures


Table 4 indicates the empirical cutoff
score which maximizes differentiation be-
tween organic and nonorganic groups for
each variable. These cutoff points were deter-
mined on the present sample. The 3-minute
correct score had the highest discrimination
with 79% correct classification. Next was
% error with 76% followed by 3-minute
complete, Q error, and improvement. In
each case the controls displayed a smaller
percentage of misclassification than did the
organics.
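
The percentages in Table 4 follow directly from the overlap counts. The Python sketch below shows that arithmetic under the assumption that "overlap" means the number of subjects falling on the wrong side of the cutoff; the function name and the rounding to whole percents are illustrative choices, not the author's stated method.

```python
def overlap_summary(missed_controls, missed_organics,
                    n_controls=120, n_organics=60):
    """Turn misclassification counts into the percentages used in Table 4."""
    missed_total = missed_controls + missed_organics
    n_total = n_controls + n_organics
    pct_total = round(100 * missed_total / n_total)
    return {
        "pct_overlap_controls": round(100 * missed_controls / n_controls),
        "pct_overlap_organics": round(100 * missed_organics / n_organics),
        "pct_overlap_total": pct_total,
        # percent correct is whatever is left after total overlap
        "pct_correct": 100 - pct_total,
    }


# 3-minute correct: 17 controls and 20 organics fell on the wrong side
print(overlap_summary(17, 20))
```

With the 3-minute correct counts this gives 14%, 33%, and 21% overlap and 79% correct classification, matching the tabled row.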

The classificatory power of these variables
taken singly and in combination is shown by
the comparison of Tables 4 and 5. In a
previous study involving 261 subjects, from
which the present 180 subjects were drawn,
B coefficients were determined for each of
these five variables as well as for the age
variable.⁴ These B coefficients were applied
to the subjects in the current sample; the
results are therefore not strictly cross-
validational in nature. The results revealed a
much smaller misclassification for both
groups. When the six variables were com-
bined, the controls showed a misclassification
of 10.83% and the organics 13.3%. For
the total sample the incorrect classification
was 11.67%.


DISCUSSION


4 An unpublished preliminary study by the author.
B coefficients were as follows: age …, 3-minute
complete .02194, 3-minute correct …, % error
−.02068, Q error −.00624, and improvement
.04118.


The results indicate that the organics are
significantly impaired in speed, accuracy, and
improvement in visual motor functioning.
Since the r_pb of the 3-minute correct is of
greater magnitude than that of the other 
scores, it suggests that the organics suffer 
most in the accuracy function. Further in- 
vestigation raises some question about such 
a conclusion. Table 2 discloses an extremely 
high correlation of the 3-minute correct ac- 
curacy measure with the speed score indicat- 
ing that the 3-minute correct is not strictly 
an independent accuracy measure. Since this 
score is also highly correlated with the other 
two accuracy measures, it suggests that the 
3-minute correct taps a combination of the 
speed and accuracy functions. Both the Q 
error and % error with their low correlations 
with speed appear to be fairly independent of 
speed. In view of these findings, a reconsid-
eration of the hierarchy of r_pb's suggests that
the speed function is most impaired, closely 
followed by one of the accuracy measures, 
i.e., % error, and then by improvement. 

Since the 3-minute correct score seems to 
be a combination of both speed and accuracy 
as well as having the largest r_pb, it suggests
that a combination of variables or functions 
is more likely to reveal a greater difference 
between groups than any single variable. Evi- 
dence for this inference is pointed up in a 
comparison of Tables 4 and 5. Not only is the 
percent correct classification greater for the 
combined variables, but also the organic and 
control groups show less divergence since they 
are within 2.5% of each other. 

An additional reason for the higher dis- 
criminatory power of the combined scores is 
the inclusion of the age variable. Since the 
two groups were equated for age, this variable 
had no effect by itself in separating organics 
from nonorganics. Yet age did have a definite 
influence within groups upon visual motor 
performance since it was significantly corre- 
lated with the three main variables. These 
correlations reveal that as age increases, 
speed, improvement, and accuracy decrease. 
These findings are similar to Wechsler’s re- 
sults (1944, 1955). The function that age 
serves in combination with the other scores is 
that of a suppressor variable. 

The IQ variable shows a low but significant 
correlation with speed but no relationship to 
accuracy and improvement. These findings 
are at variance with Wechsler (1955) who
revealed a correlation of .64 between Vocab- 
ulary IQ and the Digit Symbol subtest. This 
difference can be explained partly by the ab- 
sence of the lower quartile from the IQ range 
in the present study. Another factor which 
may contribute to the lower correlation is 
that organics with negligible correlations are 
more heavily weighted in the present sample 
than would be expected in the general popula- 
tion with which Wechsler dealt. 

The results in Table 3 show three sets of 
correlations to be significantly different for 
the two groups. On age with IQ the two 
groups went in opposite directions. The ex- 
planation may be that merely by chance this 
organic group was more heavily weighted for 
the higher IQs in the older age range than the 
controls. Likewise by chance the controls were 
more heavily weighted for lower IQs in the 
younger age range. The correlation of IQ and 
3-minute correct shows the controls tending 
in the direction of Wechsler’s result (1955) of 
a definite influence of IQ upon the Digit Sym- 
bol subtest score. With the impairment in 
visual motor functioning in the organic, IQ 
seemingly plays very little part in affecting 
the magnitude of his score. The third set of 
correlations that yielded a significant differ- 
ence between the two groups was 3-minute 
correct and % error. The factor which may 
account for the difference is that the organic 
group made a significantly greater number of 
errors thus lowering their 3-minute correct 
scores.

If we exclude the 3-minute correct score, 
which was found to be a combination of speed 
and accuracy, Table 2 presents the striking
finding that the intercorrelations are low for
the measures tapping the three functions of
speed, accuracy, and improvement. This sug-
gests that the three functions are relatively
independent. The conclusion of independence, 
however, has to be made with certain reserva- 
tions. Since all the scores are derived from a 
single measuring instrument administered on 
one occasion, they are based to some extent 
upon responding to the same items. The speed 
score is based on all the correct and incorrect 
symbol substitutions. Therefore, error scores 
involve an overlap of certain items contained 
in the speed score. Similarly, the improve-
ment score has components of both the speed
and accuracy measures. In order to test fur-
ther for the independence of these three func-
tions, a design in which measures are derived
from experimentally different items will be
required.
It remains for future studies to determine 
how the current results compare with samples 
involving other clinical groups such as schizo- 
phrenics and psychotic depressions. Confu- 
sion, motor retardation, and agitation in these 
groups may have disturbing effects upon the 
speed, accuracy, and improvement functions.


SUMMARY 


Speed, accuracy, and improvement in vis- 
ual motor functioning were investigated in 60 
organic and 120 nonorganic subjects. A 3-
minute substitution task was employed to
obtain the data. This task yielded one speed,
one improvement, and two strictly accuracy
measures.

The results indicated that speed, accuracy,
and improvement in visual motor perform-
ance were all significantly impaired in the
brain injured group. The relatively low inter-
correlations found suggested that the three
variables may be fairly independent and spe-
cific factors contributing to the more general
visual motor function or process. A fourth
variable, age, appeared to be less independent
since it correlated significantly with the first
three variables. The fifth variable, IQ, unlike
the first four, showed no noticeable influence
upon visual motor performance. The implica-
tions of these findings were discussed.

The discriminant power of each of the
scores was studied in relation to the number
of correct classifications of organic and non-
organic subjects. These findings were then
compared with the discriminant power of the
combination of all of the scores. The result
was that the combined scores yielded a higher
percentage of correct classifications than the
individual measures.
THE INFLUENCE OF CONTEXT ON THE DEPRESSION
SCALE OF THE MMPI IN A PSYCHOTIC POPULATION 


GORDON W. OLSON 
Anoka State Hospital, Minnesota 


This report is felt to be of current interest 
and value in view of the recent marketing of 
many antidepressant drugs, an influx which 
has created the need to objectively assess 
their effectiveness. Depression is often an es- 
pecially difficult symptom to assess clinically 
because of the variety of manifestations and 
because it is frequently masked by or mixed 
with more dramatic symptoms such as schizo- 
phrenic apathy and hypochondriacal com- 
plaints. The present study was the result of 
a search for a short, reasonably objective, 
and easily procured measure of depression. A
striking possibility appeared to be the de-
pression scale (D) of the MMPI, the pur-
pose of which was to identify the “state of
mind characterized by poor morale, lack of 
hope in the future, and dissatisfaction with 
one’s own status generally” (Hathaway & 
McKinley, 1942). The question posed was 
whether the D scale by itself would measure 
the same thing as when the entire inventory 
is administered, a question that is not an- 
swered by previous studies of reliability of 
the D scale. 

Canter (1960) mentions the often-heard 
criticism that different response sets may be 
elicited if items or single scales are isolated 
from a main body of items, but found in his 
study that the D scale appearing out of con- 
text (but in combination with Pz and K) did 
differentiate among suicidal, nonsuicidal psy- 
chiatric, and nonhospitalized groups. This at- 
tests to the validity of the D scale and sug- 
gests that context played a not-too-important 
role in that instance. 


METHOD 


The 60 items comprising the D scale of the MMPI
were mimeographed as a separate form and columns pro-
vided to check the items as true or false. Fifty psy-
chiatric inpatients at Anoka State Hospital were ad-
ministered consecutively the entire MMPI and the
60-item D scale. The sample contained 30 females
and 20 males. Subjects (Ss) did not know before-
hand that a second procedure would follow the first.
One-half of the group took the entire MMPI first
and the other half took the D scale only first. Ss
were not selected with the exception that many
had been referred for psychological examination and
others were seen for routine testing on admission.
Statistical analysis involved a t test of the differ-
ence for related means of the raw scores under the
two conditions and the Pearsonian correlation co-
efficient for the 50 pairs of scores.
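
The two statistics named in this analysis can be sketched in Python as follows. The function names are hypothetical and the five score pairs are invented purely for illustration; they are not the study's data, which the article does not list.

```python
import math


def paired_t(first, second):
    """t for the difference between related means (paired scores)."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if variance == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_diff / math.sqrt(variance / n)


def pearson_r(x, y):
    """Pearson product-moment correlation for paired scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical D raw scores: scale given alone versus within the full MMPI.
d_alone = [22, 18, 30, 25, 15]
d_in_context = [21, 18, 31, 25, 16]
print(round(paired_t(d_alone, d_in_context), 3))
print(round(pearson_r(d_alone, d_in_context), 3))
```

A small t together with a high r is the pattern the study looks for: the two administrations yield nearly the same mean and nearly the same rank ordering of subjects.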


RESULTS AND DISCUSSION


The mean raw score when the D scale was
administered alone was 22.1 and it was 22.1
when a part of the entire test. The standard
deviations were 6.1 and 6.7, respectively. The
t for this difference was .02, nonsignificant.
The r for all pairs of scores was .99 ± .14.
The largest difference found in this sample
was six points and the median difference was
zero.

While the results of this study do not bear
on the validity of the D scale as a measure of
depression, they do clearly indicate that what-
ever is measured by the D scale of the MMPI
can be measured in a psychiatric population
without administration of the entire inventory.


SUMMARY


The recent influx of antidepressant drugs
stimulated the search for a short, objective,
and easily procured measure of depression.
The D scale of the MMPI would seem to ful-
fill these criteria if it could be administered
separately, but the question presented was
the familiar one of effect of context on re-
sponse set. Fifty psychiatric inpatients were
individually administered the entire MMPI
and the D scale only; one-half took the
MMPI first and one-half the D scale first.
The correlation of D scores in the two situa-
tions was .99 and the t nonsignificant. It was
concluded that context did not influence the
response set and that whatever is measured
by the D scale can be measured without ad-
ministration of the entire inventory.
SUPPRESSING DISTORTION IN TEMPERAMENT
INVENTORIES 


JACK HAND AND HERBERT H. REYNOLDS
Baylor University 


This study was designed to determine the 
effects of “appraisal” and “research” testing 
conditions upon six scales taken from the 
Guilford-Zimmerman Temperament Survey 
(GZTS). 

Subjects consisted of 373 USAF basic 
trainees divided into two groups. In the 
first week of duty both groups completed a 
self-report temperament inventory composed 
of 240 items from the GZTS. Instructions 
for the Research Group were: 


I am studying personality tests and need your 
help. This work has no connection with the Air 


Force. Please do not put any identifying marks on 
your answer sheets. 


This was in addition to the regular instruc- 
tions. These instructions were administered 
by the senior author in civilian clothes. The
Appraisal Group was told: 


As you know, personality characteristics are re- 
lated to success in the Air Force. The tests you are 
about to take will be a matter of record. 


These instructions were administered by the 
co-author in full uniform (Air Force Cap- 
tain). 

Included in the inventory are the following 
scales: 


DG—8 socially desirable items from the GZTS
G scale ¹


UG—8 socially undesirable items from the GZTS 
G scale 


DR—10 socially desirable items from the GZTS 
R scale 


UR—10 socially undesirable items from the GZTS 
R scale 


SD—a scale designed to measure SD ²


1 SD values for all items in the GZTS (except
MF) were established by another investigator
(Kelley, 1959).

2 A description of this scale is being prepared for
publication. The correlation between it and Edwards
SDS (Edwards, 1957) is .54.


TABLE 1

COMPARISON OF APPRAISAL AND RESEARCH GROUPS
ON SIX CLUSTERS OF ITEMS FROM THE GZTS

               Appraisal        Research
               (N = 190)        (N = 183)
                                                 t ratio
Variable       M       SD       M       SD      (M1 − M2)
DG            5.25    1.79     4.84    1.80       2.20*
DR            6.28    2.16     5.67    2.26       2.66**
UG            3.52    1.78     3.95    1.69       2.39*
UR            3.56    1.85     4.43    1.98       4.39**
DG + UG       8.77    2.89     8.79    2.76        .07
DR + UR       9.84    3.09    10.10    3.48        .76

* p < .05.
** p < .01.


Each subject received seven scores: DG,
UG, DG + UG, DR, UR, DR + UR, and
SD. The DG + UG and DR + UR are scores
from scales composed of an equal number
of desirable and undesirable items.


RESULTS AND DISCUSSION 


Table 1 gives comparisons of group means
on the temperament variables. These results
indicate rather clearly the effects of different
instructions and conditions upon the tempera-
ment scores. When defensiveness is stimu-
lated the scores are increased. The insignifi-
cant group differences on variables DG + UG
and DR + UR indicate, however, that a
balanced design eliminates the effects of
defensiveness (provided appraisal conditions
stimulate defensiveness).
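
The t ratios in Table 1 can be approximated from the tabled means, standard deviations, and group sizes. The Python sketch below assumes a separate-variance standard error, sqrt(SD1²/N1 + SD2²/N2); the article does not state which t formula was used, so this is an illustrative reconstruction and the function name is hypothetical.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t ratio for two independent groups from summary statistics only."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Appraisal (N = 190) versus Research (N = 183)
t_dg = t_from_summary(5.25, 1.79, 190, 4.84, 1.80, 183)           # DG
t_balanced = t_from_summary(8.77, 2.89, 190, 8.79, 2.76, 183)     # DG + UG
print(round(t_dg, 2), round(t_balanced, 2))
```

Under this assumption the DG comparison comes out near 2.2, while the balanced DG + UG comparison is near zero, which is the contrast the paragraph above describes.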

The product-moment correlations (for ap-
praisal group) of SD with DG, DR, UG,
UR, DG + UG, and DR + UR are .32, …,
−.34, −.46, .00, and .03 (first four signifi-
cant at .01 level), further suggesting that a
balanced design eliminates the influence of
SD upon temperament scores obtained under
appraisal conditions.

Probably the most important interpreta-
tion of this data is that with a balanced
design one set of norms may be appropriate
for the two most widely applicable testing
conditions.
SEDATIVES AND SUGGESTIBILITY IN NEUROTIC PATIENTS ¹


J. G. INGHAM ²
Llandough Hospital, Glamorgan, Wales


AND J. M. WHITE
Stanley Royd Hospital, Wakefield, Yorkshire, England 


Ingham (1955) found that neurotic patients 
taking sedatives were significantly more sug- 
gestible than unsedated neurotics. Suggestibility 
might have been increased by sedation in these 
patients, or alternatively the more suggestible pa- 
tients might, though not necessarily intentionally, 
have been selected for sedative treatment. Two 
investigations were done to find out which of 
these interpretations was more likely. 

In the first investigation, 10 male neurotic pa- 
tients were given the same medication (6 grains 
of sodium amytal per day, in divided doses) for 
3 days following admission to hospital. They were 
then divided at random into two groups and took 
part in a 2-day experiment. One group was tested 
(a) after a further day on the same regime and 
again (b) after one day without drugs. For the 
other group, b was followed by a. The arm-move- 
ment test of suggestibility was used, as well as a 
test of static ataxia and arm-movement without 
suggestion. 

This preliminary experiment offered no sup- 
port for the idea that moderate sedation increases 
suggestibility but its value was limited by the 
small number of subjects and by the fact that 
the period without drugs was so short.


1 An extended report of this study may be ob- 
tained without charge from J. G. Ingham (Medical 
Research Council Social Psychiatry Research Unit, 
Llandough Hospital; Penarth, Glamorgan, Wales; 
United Kingdom) or for a fee from the American 
Documentation Institute. Order Document No. 6543 
from ADI Auxiliary Publications Project, Photo- 
duplication Service, Library of Congress; Washing- 
ton 25, D. C., remitting in advance $1.25 for micro- 
film or $1.25 for photocopies. Make checks payable 
to: Chief, Photoduplication Service, Library of Con- 
gress. 

2 Previously in the Medical Research Council Neu- 
ropsychiatric Research Unit, Whitchurch Hospital, 
Cardiff, Wales. 


In a second investigation, an attempt was made
to eliminate the difference in suggestibility be-
tween sedated and unsedated patients by test-
ing them after giving the same amount of seda-
tives to both groups for a few days. If the dif-
ference is a result of sedation, then it should be
possible to eliminate it by such a procedure. If,
on the other hand, the difference results from se-
lection, then it should remain, even when the
groups are equally sedated.

Forty-two male neurotic patients (15 of them
sedated before admission) were tested on admis-
sion to hospital and again 3 days later. During
the intervening period all patients received 6
grains of sodium amytal per day, in divided doses.
It was again found that, tested on admission,
previously sedated patients showed significantly
greater arm-movement suggestibility than unse-
dated patients, though both were significantly
more suggestible than a group of 27 normal males.
On retest, after 3 days of medication, the signifi-
cant difference in suggestibility between previ-
ously sedated and unsedated patients remained.
There was no indication of an increase in sug-
gestibility, following medication, in either group.
The findings support the hypothesis that there
was a selection factor operating, whereby the
more suggestible patients were more likely to be
receiving sedatives.

Additional results from both investigations sug-
gest that static ataxia increases following moder-
ate sedation.
SEMANTIC DIFFERENTIAL RATING OF SELF AND OF
SELF-REPORTED PERSONAL CHARACTERISTICS ¹


JAMES E. MADDEN


Veterans Administration Hospital, Chillicothe, Ohio


The semantic differential has been used as a
measure of similarity of affective reactions to
various concepts. When two or more concepts are
rated similarly (have close profile agreement) the
fact can be interpreted in clinical psychological
terms. For example, equating the “self concept”
and “descriptive” concepts suggests that the de-
scriptive concepts are important aspects of the
individual's sense of self. The present study re-
veals the extent to which this exampled interpre-
tation may be justifiable. The semantic differen-
tial profiles of “I, myself” and independently
indicated aspects of self (self-reported personal
characteristics) are compared, to see if in fact
self and aspects are rated similarly. In addition
the study deals with … show the relation
between profile agreement and … descriptive
concept is an aspect of self.

The descriptive concepts (personal characteristics)
were derived from Mf scale items … For the …
task, the items were presented in the third person
singular (“A person who: . . .”). All items were
rated by each S; however, only the concepts derived
… marked True by S during an independent ad-
ministration of the items in their usual self-report
form are considered to represent aspects of his
sense of self. When S marks an item True, he
virtually implies “I am a person who... .” Also
rated was the concept “I, myself.”


1 An extended report of this study may be ob-
tained without charge from James E. Madden (Clini-
cal Psychology Service, Veterans Administration Hos-
pital, Chillicothe, Ohio) or for a fee from the
American Documentation Institute. Order Document
No. 6542 from ADI Auxiliary Publications Project,
Photoduplication Service, Library of Congress; Wash-
ington 25, D. C., remitting in advance $… for
microfilm or $1.25 for photocopies. Make checks
payable to: Chief, Photoduplication Service, Library
of Congress.

2 The data reported here are from the writer's …,
1956, completed at the University of …


A set of 15 bipolar seven-step scales (five 
evaluative, five potency, and five activity) was 
used to rate each concept. The square root of the 
sum of squared distances was employed as the 
measure of profile agreement or Distance be- 
tween “I, myself” and each of the other concepts. 
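
The Distance measure described above (square root of the sum of squared scale differences) can be sketched in Python. The function name and the example ratings are hypothetical; the only details taken from the text are the 15 scales and the seven-step format.

```python
import math


def profile_distance(ratings_a, ratings_b):
    """Square root of the sum of squared differences between two profiles."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("profiles must use the same scales")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ratings_a, ratings_b)))


# Invented ratings on 15 seven-step scales (values 1 through 7).
i_myself = [6, 5, 6, 7, 5, 4, 4, 5, 3, 4, 5, 6, 5, 4, 5]
some_concept = [5, 5, 6, 6, 4, 4, 3, 5, 3, 5, 5, 6, 4, 4, 5]
print(round(profile_distance(i_myself, some_concept), 3))
```

A Distance of zero means identical profiles; with 15 seven-step scales the largest possible value is the square root of 15 × 36, about 23.2.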

Thirty college students were Ss. The Mf items,
in their usual True-False form, were administered
to half the Ss approximately a week before they 
rated the concepts on the semantic differential. 
The remaining half had the reverse sequence of 
tasks. 
The range of Distances for each S was divided 
into tenths. In each S’s data the percentage of 
True items (those which he had independently 
marked True) was computed in each of the 10 
regions. Data were adjusted so that a 50% value 
for a region meant that True items had no greater 
or less tendency than False items did to occur in 
that region. The percentages of True items in the 
regions were then averaged across Ss. 

For all 30 Ss the resulting percentage quantities,
from Region 1 (closest to “I, myself”) through
Region 10 (most distant from “I, myself”), were
65.1, 56.8, 51.6, 45.7, 40.4, 33.5, 23.9, 21.4,
21.2, 20.6. The mean percentage of True
items in the first five regions is significantly
larger than that in the last five regions, beyond
… similar for … The probability that a concept
is an aspect of self is seen to be directly related
to the amount of agreement between the ratings
of self and the concept. From the data’s negative
facets (e.g., some False items in close regions and
some True items in distant regions), some chal-
lenging theoretical possibilities can be gleaned.
(Received June 13, 1960) 
FIELD DEPENDENCE, MANIFEST ANXIETY, AND
SOCIOMETRIC STATUS IN CHILDREN ¹


IRA ISCOE AND JOYCE ANN CARDEN


University of Texas


The Children’s Manifest Anxiety Scale (CMAS) 
as a measure of anxiety or drive level, a rating 
method of sociometric choice (rating own sex) 
employing three choice criteria as an index of so- 
cial status, and Witkin’s Embedded Figures Test 
as a measure of field dependence-independence
were each administered to an entire sixth grade 
class composed of 16 boys and 15 girls. Mean 
age of the girls was 11 years 7 months, for the 
boys 11 years 11 months. Mean IQ was 118 for 
girls, 116 for boys. They were homogeneous with 
respect to religion and socioeconomic class. They 
had been acquainted for at least 7 months prior 
to the study. 

A significant rank-order correlation of .57 (p 
<.05) was obtained between sociometric status 
and field dependence in girls. For boys the re- 
sults were all in the opposite direction, with the 
choice of “class officer” correlating —.51 (p< 
.05) with field dependence. The data suggest that 
popular boys are more likely to exhibit an active 
field analytic orientation and girls a passive field 
dependent one. CMAS scores were not related to 
sociometric status for boys while significant nega- 
tive correlations (p < .05) on all criteria ques- 
tions were obtained for girls. In addition, the 
number of rejections received was significantly 
related to anxiety level, the r being .65 (p< .01). 
It would appear that more frequently chosen 
girls tend to have a lower drive level (anxiety) 
while under chosen girls have a higher level. The 
more rejected a girl is by her peers, the higher
her drive level. For boys, the correlation between
drive level and field dependency was positive but
not significant. For girls an r of -.60 was obtained
(p < .01). This would indicate that field
independent girls tend to be more anxious than
field dependent ones. Level of intelligence was
not significantly related to any of the sociometric
choices, nor to scores on the CMAS. Some
support for a significant negative relationship
between field dependence and intelligence was
obtained for girls but not for boys.

The results offer some support to the findings
of Witkin and his associates, in regard to the field
dependence-independence dimension and personality
characteristics. In the present study, at the
age and cultural level employed, the girl who is
an active initiator and organizer is not likely to
enjoy high social status with her peers. In contrast,
the relatively field independent boy is more
likely to gain wider acceptance by his classmates.

The descriptions Witkin uses to characterize
field independent vs. dependent persons may
well represent the kinds of behavior our middle
class culture fosters and rewards at these ages.
Boys are expected to be somewhat aggressive, direct,
and analytic, while girls are taught a more
submissive, conforming, “ladylike” type of behavior.
The girl who identifies with this role gains
acceptance and is subjectively aware of fewer
discomforts as picked up by the CMAS. For
girls, an analytic (field independent) mode of
perceiving results in less popularity and more
anxiety. The reversal of the relationship between
field dependence and social status in boys perhaps
emphasizes the cultural rewards for
exhibiting initiative. The fact that the types of
behavior examined in the current study have some
correlates to modes of perceiving points again
to the intricate relationships that exist between
perception and personality.


1 An extended report of this study may be ob-
tained without charge from Ira Iscoe (Department of
Psychology, University of Texas; Austin 12, Texas)
or for a fee from the American Documentation Insti-
tute. Order Document No. 6541 from ADI Auxiliary
Publications Project, Photoduplication Service, Li-
brary of Congress; Washington 25, D. C., remitting
in advance $1.75 for microfilm or $2.50 for photo-
copies. Make checks payable to: Chief, Photodupli-
cation Service, Library of Congress.


(Received August 1, 1960)


THE ROLE OF THE INTERNSHIP IN THE RESEARCH
TRAINING OF THE CLINICAL PSYCHOLOGIST


JULES D. HOLZBERG 1
Connecticut State Hospital 


Training in clinical psychology has consisted
of two aspects: (a) a minimum of 3 years
at a university where the student receives
a generic education in basic psychology,
training in theory and techniques [...]
to clinical psychology, and [...] experience
in research; (b) at least 1 year of practicum
experience (internship) in a clinical
center to provide substance for the more
academic aspects of the student’s clinical
training. Within this organization of training,
the university has been [...] responsible
for the training of the clinician as a
psychological researcher, while the
internship is generally conceived of as
providing only minimal support for the research
aspects of a clinical student’s training.


1 The author is indebted to [...] Feldman, [...]
the manuscript [...]. However, the author [...]
full responsibility for the idea[...]


However, one may accept this basic
responsibility of the university while at the
same time questioning the adequacy of a
conception that denies to the internship a
significant responsibility in the training of
the clinician as a researcher. The university
is unquestionably better equipped [...] from the
[...] statistical analyses, in meth[...]
populations, and the numerous skills
[...] the competent researcher.
However, the internship center [...]
role in this research [...]
at least two significant [...]: (a) [...]
student to
the meaningful research problems in the 
clinical field and to demonstrate how clinical 
skills and techniques may be adapted to re- 
search goals, while providing a framework 
that fosters research on clinical problems 
without destroying their clinical significance; 
and (b) to contribute to the building of a 
self-concept that effectively integrates the 
role of “researcher” and “clinician.” The 
first of these aspects is by no means insig- 
nificant, for the university may, because of 
its frequent insistence on the absolute 
“researchability” of a problem, force the 
clinical student into a research area where 
the problem is “neat,” but the clinical sig- 
nificance quite meager. This is not meant as 
a criticism of university training, since the 
author recognizes that the function of the 
university is to provide generic training in the 
development of research skills, and there may
be real merit in focusing this training on
“researchable” problems. We must, however,
be prepared for one possible consequence of
this training, i.e., this criterion of research-
ability may become so rigidly fixed that the
clinician may never undertake research on a
clinical problem, because it is rare indeed
that a clinical problem can hope to attain the
level of researchability that meets the stand-
ard set for students in the typical university.

The second research training function of 
the internship is related to the author’s ob- 
servations that clinical interns frequently 
come to the internship with a strong sense 
of inferiority as a researcher. Whether this 
is a function of the preselection of clinical 
students (in that students entering into the 
clinical area may do so because of intellectual 
and personality reasons that are incompatible 
with a research role) or whether this is a
function of the absorption of attitudes
toward the clinician as a research worker
that unfortunately permeate many of our
universities, the result often seems to be
that when the clinical student enters into
the internship, he does so with a self-concept
that is relatively weak with regard to the
integration of research as a practicing role.

The internship frequently does not help to
counter such feelings of inferiority. It may 
fail to provide a “culture” which rewards 
the development of a research role for the 
clinician by encouraging other role behaviors, 
such as that of diagnostician and therapist. 
The practicing clinician in the internship 
center may further contribute to this self- 
concept by failing to provide an adequate 
model of the clinician as a research worker. 
He is usually not involved in research and 
may actually give voice to antiresearch atti- 
tudes, so that we have the vicious “cultural” 
cycle of the clinician, having learned to reject 
the role of research for himself, now fostering 
the same attitudes in the students sent to 
him for training. 


THE ACQUISITION OF A RESEARCH ROLE 2


I have been using two concepts which 
should be more precisely defined. One is that 
of “role” of a researcher, i.e., those actions 
which are performed or not performed to 
confirm the occupancy of the research role. 
The second, “self-concept,” may be defined 
as the values used by a person to describe 
and evaluate himself, and here I am specifi- 
cally referring to self-descriptions of the 
clinician with regard to values pertinent to 
being a research worker. These two concepts 
are functionally quite interrelated since atti- 
tudes toward the self affect the kind of role 
that will and can be played, and conversely, 
the success with which one plays or does not 
play a particular role will in turn affect the 
self-concept. 

Two methods of role learning may be 
delineated, i.e., intentional learning and in- 
cidental learning. Intentional learning is 
explicit learning and the intentional learning 
of a research role occurs primarily in the 


2 A number of the ideas presented in this paper
were stimulated by Sarbin’s (1954) excellent dis-
cussion of role theory.

university. In incidental learning, the indi-
vidual adopts “the prevailing pattern.” It
is the author’s thesis that incidental learning, 
much of which takes place in the clinical 
center, plays a significant part in the ultimate 
development of the self-concept of the 
clinician. Incidental learning occurs essen- 
tially through a process of identification 
(unconscious) or imitation (conscious). This 
places great emphasis on those in the 
emerging clinical psychologist’s environment 
who have securely integrated the role of 
researcher into their professional self-concept.
It is here that an important defect exists in
the opportunities for learning a research role
since the figures who securely integrate the
research role into their self-concepts are
frequently unavailable or minimally available
in the internship. For the most part, there
are figures who are diagnosticians, therapists,
administrators, etc. The intern can identify
with figures that are not in his immediate
visual field, such as significant researchers in
the field of psychology. However, we have
perhaps failed to emphasize the fact that
imitation, which may be more important in
the development of the professional
self-concept, does require the figure to be in the
visual field.

This has been implicitly recognized in
many clinical centers with the result that
more clinical centers are today resorting to
the use of research consultants who serve the
important role of providing figures with a
secure self-concept that incorporates the role
of researcher. These are, for the most part,
research psychologists from universities who
visit the internship center at varying intervals,
spending several hours on each visit and
advising on theoretical conceptualization,
research designing, statistical analyses, and
other issues pertinent to research. But even
the use of consultants may be insufficient.
Frequently, these are fleeting figures who
appear irregularly, who may not actually
become involved in research within the clinical
center, and who are often not clinicians but
are the prototypes of those academic figures
who have originally contributed to feelings


of inferiority about the clinician’s research
role.

This may be a considerable oversimplification.
One may still have figures available who
provide appropriate models for identification
or imitation, but there may still not be a
basic motivation for learning a research role.
From where does the motivation to learn and
[...] the role of researcher stem? Such
motivation may derive from factors such as
[...] rewards, status, advancement, and
[...] extrinsic stimulants. However, we cannot
[...] the fact that there may be more
[...] factors operating to provide motivation
[...] the learning of a research role.
[...] and Brody (1955) have emphasized
research motivation as being derived from
[...] unconscious drives to gratify infantile
curiosity and the wish for omniscience
[...]. Kubie (1953) has carried this
[...] and has attempted to relate [...]
to the various aspects of research,
such as the problem selected for
study, the specific research techniques utilized,
the hypothesis selected for verification,
and other aspects of the research process.

In addition to learning the skills of research,
including clinical research skills, and
in addition to learning the role of researcher
and effectively integrating it into one’s
self-concept, there is another type of learning
that is a joint responsibility of the university
and the clinical center. This is the learning
of the ethics of research. I refer here to the
obligations of the researcher to his [...] or
to his profession, to the organization which
employs him, to his collaborators, etc. Too
often one observes the researcher, be he
[...] or not, whose competence in terms
of knowledge and skills is obviously of a
high order, but who has failed to integrate
a valid system of ethical values into his
[...] activity.


PROBLEMS OF ROLE ENACTMENT


[...] problems [...]
a multiplicity of roles, e.g., [...]
therapist, teacher, researchist, etc. At
[...] there may be conflict between certain
of these roles, which may then result in the
relinquishing of one of the roles in order to
resolve the conflict. Where one of the roles
is that of research, this frequently leads to 
the relinquishing of the research role because 
the clinical center frequently supports this 
role less than others. 

Role conflict occurs if the individual 
occupies two positions simultaneously, and 
when the role expectations of one position 
are incompatible with the role expectations 
of the other. It is rare indeed that situations 
arise which present this type of role conflict 
as it pertains to the practice of research. 
Where one finds the verbalization of such 
role conflict, one must suspect that there 
exists another explanation, other than a real 
role conflict, since it is rare that the indi- 
vidual is expected to perform a research role 
and any other role simultaneously. The role 
of clinician and the role of research worker, 
even where these are successfully integrated, 
are not performed simultaneously, but are 
performed at different times, in different 
settings, in relation to different individuals, 
etc. The author views the multiple roles of 
the clinical psychologist not as mutually in- 
compatible, but mutually complementary. 
The ability to successfully perform a multi- 
plicity of professional roles permits one to 
enrich the particular role that the psycholo- 
gist may be enacting at a particular time. 

It has frequently been stressed that the 
attitudes and skills required for research are 
significantly different from those required for 
clinical activity. The problem here is not that 
there is a basic incompatibility between
research and clinical activity, but that certain 
types of research, usually of the experimental 
laboratory type, are strikingly different from 
what is involved in clinical practice. How- 
ever, there are many problems which cannot 
at the moment be studied effectively in a 
strictly experimental laboratory manner, and 
here the clinical method, with its latitude, 
with its less standardized approach, with its 
greater subjectivity, can be marshaled to 
fulfill a research objective. The problem is 
one of adapting the clinical method so that 
it can become a useful research method. In 
fact, the author has stressed elsewhere the 
fundamental resemblance of the clinical and 
the scientific methods (Holzberg, 1957).

It is not being suggested that the clinical
psychologist must enact all of his roles with 
the same degree of intensity or of involve- 
ment. It is the rare individual who can main- 
tain intense affective involvement and intense 
expenditure of physical effort in all of the 
roles that he may be expected to perform. 
The clinician need not have the highest 
involvement in research, but somewhere in 
the repertoire of roles there must be inte- 
grated into the clinician the availability of 
the research role. Clinical psychologists vary 
in their enactment of the research role such 
that each may be located on a continuum 
extending from almost automatically enacting 
the role, to the other extreme where the 
clinician is very self-conscious of his own 
role enactment. The problem is how to make 
the clinician, and particularly the clinical 
student, become less self-conscious of the 
enactment of the research role. 

To perform a role, the individual must 
clearly know the expectations of those with 
whom he interacts in this role. Unless the
internship staff conceptualizes the intern’s role 
as incorporating research, the intern’s enact- 
ment of the research role may present many 
problems. Thus, he may act out a role which 
others may consider inappropriate, if not 
actually bizarre. Of crucial importance here 
are the overt acts of others, rather than 
verbalizations. If others are doing research, 
if research is being actively supported in the 
agency, this is a far more significant factor 
in revealing the attitudes toward a research 
role than mere verbalized support. 

Where the professional structure demands 
a multiplicity of role performances as in 
clinical psychology, there must be “flexi- 
bility” in the individual, so that he may 
readily be able to shift from one role to 
another. It is apparent that this is one 
dimension of the self that may make for 
difficulty in the enactment of the research 
role, simply because the individual may be 
lacking in this capacity for flexibility. At the 
same time, the individual must possess suf- 
ficient capacity to erect barriers around a 
particular role being enacted, so that the role 
does not shift inappropriately.


DEFENSIVE REACTIONS TO RESEARCH ROLE
CONFLICTS


If the actions and qualities of a given
role are congruent with the self-concept
maintained by the individual, this enhances
the probability that the individual will be
able to perform in a way consistent with the
role expectations. It is the thesis of the writer
that the incongruence of the role of the
researchist with the self-concept of the clinician
is a central problem. This is in essence a
situation of conflict, and like conflicts which
are not directly resolvable, leads to tension
and resort to various defensive operations.
One of these is rationalization, which is best
exemplified by the resort to the argument
that the research role is incompatible with
the clinician’s role.

Another defensive reaction is
projection. This manifests itself in destructive
and nihilistic attitudes toward the research
of others. This may take the form of a
psychologist criticizing another researcher’s
technical competence, his knowledge of the
area, etc., all of which are projections of his
own inadequacies.

Still another defense is that of denial, [...]
failing to recognize research problems or
denying that problems others are posing are
real research problems. This person seems to
be saying: “If I do not recognize research
problems, I cannot be expected to do
research.”

Some clinicians may resort to identification
as a way of resolving the conflict. They may
identify with strong clinical figures, such as
psychoanalytic psychotherapists, who have
considerable prestige in terms of their clinical
endeavor. The clinician who can successfully
make this identification may comfortably
remove himself from a research role, since he
can model his behavior after one who has
acceptable status and prestige.

Some clinicians remain on the fringe of
research. They serve as “advisers” to others
but never get involved in research. Frequently,
they perform a significant role as
advisers, which is testimony to the fact that
they have certain competencies regarding
research which should lead to research
activity. These individuals are so blocked [...]
sense of inferiority and the potential feeling
that their research activity will not succeed
that one finds this tragic discrepancy between
[...] ability in research and an inability
[...] in a research role. At the other
[...] is the “dilettante” who frequently
[...] have some research interest and talent,
but who wants to research every problem here
[...]. This frequently masks the insecurity
and inferiority of the psychologist
who is anxious that his research effort yield
a significant contribution, and so to buttress
his own security seeks to embark on a second
[...] project simultaneously. But, even
[...] not assuage his anxiety so that he
must turn now to a third project, hoping that
in this way a multiplicity of research activities
will at least yield one research project
which will earn him success. This multiplicity
[...] frequently leads to activity but no
[...] research.


THE CLINICIAN AS A RESEARCHER


A question that frequently is posed is
whether all clinical psychologists should be
expected to do research. Certainly, at the
very least, we should expect that the individual
who has received a PhD in clinical
psychology approach his clinical tasks with
an orientation geared to recognize problems
that are unanswered and questions seeking
solutions. Perhaps this is as much as we
should expect from the bulk of practicing
clinical psychologists: this capacity to recognize
and be aware of the problems needing
solution. Thus, an intern may conceivably
[...] successful experience with regard to
[...] if he can spend his year within a
[...] that, while it does not practice research,
at least is self-questioning and self-analyzing
with regard to its beliefs and its
techniques. However, it is the thesis of the
author that the more significant research
experience during the internship will be
accomplished where the setting is itself one
where psychologists are actively involved
in research activity.

The controversy as to whether we should
be training people to do research or to
evaluate published research still continues.
Even if we could agree that this was a
conflict, in what way would the
training for doing research be different from
the training that would enable one to evaluate 
research? Is it possible to evaluate research 
without systematic involvement in research 
that brings home to one realistically the im- 
portant issues with which one is concerned in 
research activity? 

It is important to stress that there are 
many skills that enter into being a well- 
rounded research person. Among these are: 
the capacity to recognize meaningful research 
problems; the capacity to engage in the 
creative thinking involved in building hy- 
potheses and deducing implications from 
them; the capacity to integrate and utilize 
theory to refine a problem; the capacity to 
develop a research plan with appropriate 
controls that would make possible the testing 
of the hypotheses; the capacity to assess the 
nature of populations and the types of 
samplings to be made; the capacity to select 
the appropriate techniques and instruments 
to be used in the measuring process of the 
research; the capacity to deal with subjects 
not as clinical but as research entities; the 
capacity to record faithfully responses made 
by subjects; the capacity to tabulate, score, 
and analyze the data resulting from the 
research; and the capacity to communicate, 
both in written and oral form, the nature 
of the investigation and the results. Clearly, 
there are few people who possess all of these 
skills to a sufficient degree to qualify as the 
well-rounded research clinician. Here indi- 
vidual differences must be recognized and 
accepted. Individuals vary quantitatively in 
the extent to which they possess the various 
skills that have been enumerated, and a 
place in research exists for all people trained 
as clinical psychologists, each one making a
contribution commensurate with his capaci-
ties and within his limitations. Group re-
search is perhaps a prime way of permitting
every clinical psychologist a research role.


RESEARCH TRAINING FOR THE INTERN 


It seems imperative that the setting in 
which the intern receives his first major 
clinical experience be one in which research 
in all of its aspects is a vibrant part of the 
program. This is especially true with regard 
to thinking about research. Toward this end,
we have found it advisable to have frequent
conferences devoted either to ongoing re-
search or published research. Even in those
settings that may not be able to integrate
interns into ongoing research, it seems
advisable that this provision for regular 
discussions should become institutionalized 
as part of the program. 

One problem that presents itself in many 
clinical settings is that of providing time for 
research activity in the face of service de- 
mands. We have utilized the technique of re- 
leasing the intern every third week from diag- 
nostic responsibilities in order to permit him 
to work on research. The importance of pro- 
viding time is that it is a concrete demonstra- 
tion of the department’s and the agency’s 
attitudes toward research. This open accept- 
ance of research activity as a legitimate func- 
tion of the clinician will minimize the anxiety 
or guilt in the individual who feels that he is 
sacrificing services for patients by engaging in 
research. 

Another problem with which we have strug- 
gled has been that of organizing interns so 
that they may participate in group research 
which provides a richer learning experience 
for the intern than he can normally expect 
from individual research. The opportunity for
continued conferences on the group research 
introduces a significant dimension into the 
training of the intern that can hardly be 
matched when the intern is working alone on 
his own problem, even where he is being 
supervised. Working in the group on a group 
research does not minimize the opportunity 
for individual supervision or for contact with 
consultants, but rather adds to this the in- 
volvement with a group of other psychologists 
with whom the intern can interact on research 
in a continuous way over a year. Besides the 
potentially greater intellectual stimulation 
made possible by group research, there is the 
support that is provided in dealing with the 
many complex problems that arise in the 
course of research activity. 

The use of research consultants is felt to be 
a very valuable part of the research training 
experience for the intern. Not only do they 
add to the research atmosphere of the depart- 
ment, but also these people, when carefully
selected, can provide genuine help on research
problems from the theoretical conceptualiza-
tion to the technical aspects of designing the
research and analyzing the data. The use of
these consultants may also help the intern to
understand that there may not be as great a
cleavage between the university research psy-
chologist and the “field” clinician as many
interns seem to feel. By observing the fact
that these two can communicate, share, and
sometimes collaborate to their mutual benefit,
the intern learns that there need not be a
sharp schism between clinical activity and re-
search as it is often defined in the university
research center.

A problem that must frequently be faced is
the latent feeling in some interns that, if they
engage in agency research, they will somehow
be misused. This misuse may occur at two
levels: (a) The intern may fear that he may
be relegated to tasks that are either [...]
or more appropriate to an individual with
less training. This suggests that, in providing
a training experience in research for the
intern, it must be an experience that will be
truly an educational one. He cannot be used
simply as an assistant or as a clerical worker
but must share directly in the research from
its conceptualization to its execution. (b) The
intern may fear that he will not be rewarded
for his contributions to the research in the
form of authorship. In clinical activity, this
concern is not present. Whether he is
performing as a diagnostician or therapist, the
intern is explicitly recognized for his services.
Since the diagnostic workup and the record of
treatment of the patient in the medical file
bear his name, this serves to provide the
intern with formal recognition of his labors. In
research, this immediate recognition is not
readily apparent. It has been our experience
that with some interns this is a significant
concern that may contribute to the intern’s
reluctance to become involved in hospital
research. We have found that it is well to be
explicit with interns with regard to the
rewards that they may expect from participation
in the research. This has most usually
meant that the intern is accepted as a
collaborator from the point of view of authorship.

The provision for the intern to work
on a clinical research problem during the
[...] carries with it a requirement
that the research should be supervised by
a person who not only can assess the research
from a technical point of view (design,
sampling, etc.), but can also make the research as
clinically [...] as possible. It is the rare
intern who can function independently with
regard to research activity, even when he has
already completed his doctoral dissertation.
We have recently introduced, on a trial basis,
a research training program for interns which
provides for the joint supervision of interns’
research [...] by a university experimentalist and
[...] clinician.

Part of the responsibility of the research
[...] is to communicate his research and its
findings effectively both in written and in oral
form. It is part of the training of the intern
that [...] he is involved in his research he is
expected to prepare reports to be presented at
group conferences at which consultants may
be present.


CONCLUSIONS


It may be possible to reconceptualize the
problem of how one trains the intern to per-
form a research role effectively. In a sense,
one may say that the individual learns to do
research most effectively when he is globally
involved in this process of learning. That is,
he learns first by feeling that research is an
activity personally and professionally reward-
ing and by feeling comfortable in performing
this research role because it does not conflict 
with other roles or with his own self-concept; 
he learns by thinking research, by being in- 
volved in activities that will permit him to 
extend his intellectual horizons with regard to 
research and particularly clinical research; 
and finally, he learns by doing research, by 
actually becoming involved in research ac- 
tivity. Our emphasis here has been that the 
internship’s responsibility with regard to de- 
veloping the research role of the clinician 
must be to permit the intern to begin to feel 
like a research person, to provide experiences 
that will extend his thinking about clinical 
problems and methods for dealing with them, 
and to provide an experience of doing in 
which he may practice the skills of a research
worker.


COMPARABILITY OF INTELLIGENCE QUOTIENTS
OF MENTAL DEFECTIVES ON THE WECHSLER
ADULT INTELLIGENCE SCALE AND THE 1960
REVISION OF THE STANFORD-BINET 1


GARY M. FISHER,2 BEVERLY A. KILMAN, AND ANNA M. SHOTWELL
Pacific State Hospital, Pomona, California


A number of studies comparing the 
Wechsler-Bellevue (W-B) and the Wechsler 
Intelligence Scale for Children (WISC) with 
the Stanford-Binet (S-B) have been reported. 
Correlation coefficients between IQs from 
these Wechsler scales and the S-B have been 
of the same order (.6 to .9) in normal, neuro- 
psychiatric, and mentally retarded popula- 
tions (Alderdice & Butler, 1952; Benton, 
Weider, & Blauvelt, 1941; Goldfarb, 1944; 
Gothberg, 1949; Mitchell, 1942; Nale, 1951; 
Sandercock & Butler, 1952; Stacey & Levin, 
1951; Wechsler, 1944). Although some studies
have indicated IQs from the WISC to be
slightly higher than those from the S-B, the 
two instruments give fairly comparable scores 
(Frandsen & Higginson, 1951; Gehman & 
Matyas, 1956; Harlow, Price, Tatham, & 
Davidson, 1957; Weider, Noller, & Schramm,
1951). A number of studies have indicated 
that W-B IQs are consistently higher than 
S-B IQs of older subjects (Halpern, 1942; 
Mitchell, 1942) and that this discrepancy be- 
comes larger as age increases and intellectual 
level decreases (Bensberg & Sloan, 1950; 
Kutash, 1945; Mundy & Maxwell, 1958). 

Only one study, that of Wechsler (1958),
compares the Wechsler Adult Intelligence
Scale (WAIS) with the S-B. In a sample of
52 male prisoners, aged 16 to 26 years, the
mean S-B IQ was five points higher than the
mean WAIS IQ, and the correlation between
the two IQs was .85. The present study compares
adult hospitalized retardates of differing
age and intelligence level as to their IQs on
the WAIS and the 1960 revision of the
Stanford-Binet Form L-M.


METHOD

The sample consisted of 180 mentally retarded
subjects in three California state hospitals who were
18 years or older and who had a diagnosis of familial
or undifferentiated mental retardation.3 The sample
was classified into four unequal age groups and three
unequal IQ categories, in order to obtain 12 equal
sized subgroups of 15 patients each. The age grouping
by years was: 18-34, 35-44, 45-54, and 55-73.
The grouping by intellectual level (based on the
average of WAIS and S-B IQ)4 was: 46 and below,
47-54, and 55 and over.

Half of the subjects were given the WAIS first and
the remainder the S-B first. For a given subject
different examiners administered the two tests. A
counterbalancing procedure was employed in assigning
subjects to the two testers in order to control for
possible examiner difference. To control for changes

3 Thus subjects with known brain damage were
excluded from the study. On the WAIS, a subject must
achieve a minimum Total Scaled score of 11 to obtain
an IQ from the table of norms. Consequently, any
subject achieving less than this minimum score was
excluded from the sample since only prorated IQs
could be estimated and presumably the WAIS would
not be an appropriate test for such a subject.

4 This procedure for determining IQ level was
adopted in order to circumvent the problem of
statistical regression toward the mean which would
occur if IQ level were determined by only one test.

Comparability of WAIS and Stanford-Binet

TABLE 1

COMPARISON OF WECHSLER ADULT INTELLIGENCE SCALE AND
STANFORD-BINET IQs FOR EACH AGE LEVEL

               WAIS IQ                  S-B IQ                       Correlation
Age      Mean     SD    Range     Mean     SD    Range    t value    WAIS & S-B
18-34    58.42    8.40  45-77     44.02    9.25  26-68     15.00        .736
35-44    59.44   10.20  45-96     45.84   10.99  31-75     12.04        .752
45-54    60.29    8.31  48-76     43.51    8.61  26-68     18.64        .752
55-73    63.56    9.88  51-87     41.00    8.33  26-57     23.92        .777


which might occur in intellectual functioning over an
extended period of time, all subjects were given both
tests within a 12-month period, with the great
majority of subjects receiving both tests within a
4-month period.

To obtain data that might help in developing an
hypothesis regarding the relation of social competency
to intellectual functioning on the WAIS and
S-B, subjects were administered the Vineland
Scale of Social Maturity (Doll, 1953). In addition,
an adapted form of a Scale of Minimal Social Behavior
(Farina, Arenberg, & Guskin, 1957) was administered,
but because of the lack of variability in
scores from this instrument no use could be made
of the results.


RESULTS

An analysis of variance of the difference
scores between WAIS and S-B IQs was performed.
The results indicated that age was
significant in determining the magnitude of
discrepancy between the two IQs (F = …4;
df = 3; p < .001), but neither IQ level
nor the interaction between IQ level and
age was significant.5

Table 1 shows the comparison of WAIS and
S-B IQs for each age level. Of the
subjects examined only 3 had an S-B IQ
higher than their WAIS IQ, the largest
difference being five points. The t values for the
significance of the difference between correlated
means indicated that the mean WAIS
IQ was significantly larger (beyond the .001
level) than the S-B IQ at each age level. The
correlation coefficients between WAIS and
S-B IQs ranged from .736 to .777 and a chi
square test (Edwards, 1950) indicated that
the differences among them were not significant
(χ² = .20; df = 3; p > .95).

5 Analyses of the difference scores between WAIS
Verbal IQ and S-B IQ, and between WAIS Performance
IQ and S-B IQ were also performed. The results
were virtually identical with the analysis of the
difference scores between WAIS and S-B IQs. Moreover,
these two sets of difference scores were of the
same magnitude as difference scores between
Full Scale IQs and S-B IQs.
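
The size of this chi square can be checked with a short calculation. The sketch below assumes that the test was the familiar Fisher z-transformation test of homogeneity for independent correlations and that each age group contained 45 subjects (180 subjects spread evenly over four age levels); neither assumption is stated outright in the text, and the function name is purely illustrative.

```python
import math


def homogeneity_chi_square(correlations, sample_sizes):
    """Chi-square statistic for the equality of k independent correlations.

    Each r is converted to Fisher's z, weighted by (n - 3), and compared
    with the weighted mean z. The statistic has k - 1 degrees of freedom.
    """
    z_values = [math.atanh(r) for r in correlations]
    weights = [n - 3 for n in sample_sizes]
    z_bar = sum(w * z for w, z in zip(weights, z_values)) / sum(weights)
    chi2 = sum(w * (z - z_bar) ** 2 for w, z in zip(weights, z_values))
    return chi2, len(correlations) - 1


# WAIS-by-S-B correlations for the four age levels, as given in Table 1.
table1_correlations = [0.736, 0.752, 0.752, 0.777]
# Assumed group size: 180 subjects / 4 age levels.
group_sizes = [45, 45, 45, 45]

chi2, df = homogeneity_chi_square(table1_correlations, group_sizes)
print(f"chi-square = {chi2:.2f}, df = {df}")
```

Under these assumptions the statistic comes out at about .20 with 3 degrees of freedom, in line with the value reported above.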

Table 2 sets forth the difference scores be- 
tween WAIS IQ and S-B IQ by age level. The 
t tests indicated that the mean difference 
scores for the three younger age groups were 
significantly smaller (p < .01) than for the
oldest age group. Between the ages of 18 and 
54 years, the WAIS IQ averaged 15 points 
higher than the S-B IQ, whereas in subjects 
over 55 years the difference averaged 23 
points. The standard deviation of the differ- 
ence score was approximately 6. The WAIS, 
unlike the S-B, takes systematic account of 
the lower performance of older adults; hence, 
the older the adults (beyond a peak between 
25 and 30) the lower the absolute level of 
performance and hence the larger the dis- 
crepancy between S-B and WAIS. As the 
WAIS standardization population included 
adults of a wide age range whereas the S-B 
did not, it would be logical to assume that the 
WAIS IQ is a more accurate measure of in- 


TABLE 2

DIFFERENCE SCORES BETWEEN WECHSLER ADULT
INTELLIGENCE SCALE IQs AND STANFORD-BINET IQs

              Difference Scores
Age      Mean     SD     Range
18-34    14.49    6.49    5 to 32
35-44     3.00    7.53   -5 to 27
45-54    16.56    5.75    5 to 32
55-73    22.56    6.28   10 to 36

Note.—For each subject, the S-B IQ was subtracted from
the WAIS IQ.


TABLE 3

COMPARISON OF IQs FROM THE WECHSLER ADULT INTELLIGENCE SCALE AND STANFORD-BINET WITH
SOCIAL AGES FROM THE VINELAND SCALE OF SOCIAL MATURITY

                              Vineland Scale of Social Maturity    Correlation Coefficients
         WAIS       S-B                                            WAIS IQ &     S-B IQ &
Age      Mean IQ    Mean IQ   Mean Age   SD     Range              Social Age    Social Age
18-34    58.42      44.02     10-2       1-6    5-6 to 13-3        .265          .322
35-44    59.44      45.84     10-2       1-9    5-5 to 13-0        B38           .171
45-54    60.29      43.51     9-9        2-0    4-0 to 12-10       .038          .433
55-73    63.56      41.00     9-7        2-1    5-5 to 13-3        .534          .670

telligence. It has been suggested, however,
that there is a spurious allowance for “normal”
deterioration with age for retardates on
the Wechsler scales (Bensberg & Sloan,
1950). Until validity studies, preferably of
the longitudinal type, of the WAIS with retarded
populations are made, the problem of
differential rate of intellectual decline depending
on level of intellectual functioning remains
unanswered.

The regression equations for predicting the
IQ on one test from that on the other for the
three younger age groups appeared sufficiently
similar to warrant the use of the following
regression equations for subjects between 18
and 54 years:

Predicting WAIS IQ from S-B IQ:
    Y = .68X + 29.15    (SE = 6.15)
Predicting S-B IQ from WAIS IQ:
    Y = .79X - 2.45     (SE = 6.61)

For subjects 55 years and older, the following
regression equations should be used:

Predicting WAIS IQ from S-B IQ:
    Y = .92X + 25.84    (SE = 6.22)
Predicting S-B IQ from WAIS IQ:
    Y = .65X - .31      (SE = 5.25)
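
As a hedged illustration, the regression equations above can be wrapped in two small conversion functions. The function names and the age check are assumptions added here, not part of the study. The coefficients are read as .68, .79, .92, and .65 with intercepts 29.15, -2.45, 25.84, and -.31; that reading is consistent with the group means in Table 1 (for example, .92 × 41.00 + 25.84 = 63.56 for the oldest group).

```python
def predict_wais_from_sb(sb_iq, age):
    """Estimate a WAIS IQ from a 1960 S-B IQ for a retarded adult.

    Uses the equation for ages 18-54 or the one for 55 and older.
    """
    if age < 18:
        raise ValueError("equations were derived for subjects aged 18 or older")
    if age >= 55:
        return 0.92 * sb_iq + 25.84
    return 0.68 * sb_iq + 29.15


def predict_sb_from_wais(wais_iq, age):
    """Estimate a 1960 S-B IQ from a WAIS IQ for a retarded adult."""
    if age < 18:
        raise ValueError("equations were derived for subjects aged 18 or older")
    if age >= 55:
        return 0.65 * wais_iq - 0.31
    return 0.79 * wais_iq - 2.45


# Check against the oldest group's means (S-B 41.00, WAIS 63.56).
print(round(predict_wais_from_sb(41.00, age=60), 2))
print(round(predict_sb_from_wais(63.56, age=60), 2))
```

The predictions carry the standard errors of estimate quoted with each equation (roughly 5 to 7 IQ points), so a converted score is best treated as an approximate range rather than an exact value.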


In order to gain some knowledge of the
relation between social competency and the
IQs from the two intelligence scales, comparisons
of the Social Ages (SA) from the
Vineland Scale of Social Maturity were made
with the IQs from the WAIS and the S-B.
These data are presented in Table 3. Chi
square tests indicated that the correlation
coefficients between WAIS IQ and Vineland SA
and between S-B IQ and Vineland SA were
homogeneous for the four age levels (χ²
= 5.249 and 9.264, respectively; df = 3;
p > .01). For the total sample the correlation
coefficient between Vineland SA and WAIS
IQ was .450, and between Vineland SA and
S-B IQ was .405. A nonsignificant t value of
.939 for the significance of the difference
between correlated correlations indicated that
the WAIS and S-B correlated equally well
with Vineland SA. The data shown in Table 3
suggest a trend for WAIS IQs to increase, and
S-B IQs and Vineland SA to decrease, with
age. According to analyses of variance, the
observed trends all proved to be nonsignificant
at the .01 level (WAIS: F = 2…; S-B:
F = 2.01; Vineland SA: F = 1.29). A more
thorough investigation would be necessary in
order to answer the question of the relation
between social competency and intelligence as
measured by these two scales.


SUMMARY

This study sought to determine the effect of
age and level of retardation on the comparability
of IQs from the Wechsler Adult
Intelligence Scale and the 1960 revision of the
Stanford-Binet. In addition, a measure of
social competency was related to the IQs 
from the two scales. It was determined that 
age, but not level of retardation, was signifi- 
cant in determining the magnitude of the 
difference between WAIS and S-B IQs. WAIS 
IQs averaged 15 and 23 points higher than 
S-B IQs for subjects 18-54 years and 55-73 
years, respectively. Regression equations were 
calculated to translate the IQ from one test to 
the other test. The WAIS and S-B IQs cor- 
related equally well with Social Ages from the 
Vineland Scale of Social Maturity.

PERCEPTUAL SIZE CONSTANCY IN CHRONIC
SCHIZOPHRENIA 1

H. W. LEIBOWITZ
Veterans Administration Hospital, Tomah, Wisconsin


The ability to judge correctly the sizes of 
objects despite variation in viewing distance, 
i.e., size constancy, represents an important 
biological achievement of living organisms 
and has been studied in relation to the under- 
lying mechanisms as well as to psychopathol- 
ogy. In particular, the possibility that size 
constancy differs among schizophrenics as 
compared with normals has been of interest 
to a number of investigators and theorists 
(Lovinger, 1956; Maes, 1957; Raush, 1952, 
1956; Reynolds, 1954; Sanders & Pacht, 
1952; Weckowicz, 1957). Bruner (1951) 
states the theoretical problem most succinctly 
when he suggests that a withdrawal from ob- 
ject relations and an increasing concern for 
the self might lead to a breakdown in per- 
ceptual constancy. However, as pointed out 
in a recent summary (Rabin & King, 1958), 
there is little agreement among the results of 
previous experiments. Sanders and Pacht 
(1952) found “overconstancy” among a schizophrenic
group as compared with normals,
while Weckowicz (1957) reports the opposite 
relationship. Raush (1956) found no differ- 
ences between nonparanoid schizophrenics, al- 
though his paranoid group did differ from 
normals. 

The purpose of the present experiment is
to re-examine this problem, utilizing a group
of chronic, undifferentiated schizophrenics
chosen from the patient population of a
neuropsychiatric hospital to be most “withdrawn”
with respect to their behavior. It is felt that
the selection of the extreme cases from among
a group of chronic patients whose diagnosis is
more reliable provides an excellent test of the
hypothesis under consideration. The method
of testing has been modified after techniques
which have been employed in experimental
laboratories and for which a large body of
normative data is already available (Holway
& Boring, 1941; Leibowitz, in press; Leibowitz,
Chinetti, & Sidowski, 1956). Essentially,
the technique requires the subject to signify
which of two sticks is larger, a simple task
which has been used successfully with children
and feebleminded groups. Data are obtained
over a wide range of viewing distances,
thus providing the basis for a more complete
analysis of the size matching-distance
relationship.

1 Supported, in part, by Grant M-1090 from the
National Institute of Mental Health, National Institutes
of Health, United States Public Health Service,
and the Veterans Administration Research Program.


METHOD

Subjects

The members of the experimental group were
male patients of the Veterans Administration Hospital
in Tomah, Wisconsin, carrying a diagnosis of
chronic, undifferentiated schizophrenia. There was no
history of brain damage, organic pathology, or …
defect. None of the patients were receiving …,
although many of them were on different types of
drug therapy. They were selected by the psychiatric
ward team to be mainly characterized by withdrawal
symptoms. The mean age of the 35 patients was
39.41 years with a range from 24 years to 56 years.
The average length of psychiatric hospitalization was
8.84 years with a range of from 2 years to 13 years.

The members of the control group were 20 psychiatric
aides and were selected at random. Their mean
age was 38.52 years with a range of from 25 years
to 54 years.


Experimental Conditions

The experiment was conducted in a hospital corridor
94 feet wide, 93 feet high, and 135 feet
long. Large windows were located every 12 feet
along the corridor and all tests were made in
daylight. The test objects consisted of five wooden
sticks cut from 1 inch stock to lengths of 2, 4, 8, 16,
and 24 inches. These were mounted in square bases
4 inches by 4 inches and were painted “flat” black.
The viewing distance of the test objects was proportional
to their size, the ranges being 10, 20, 40, 80,
and 120 feet, respectively. Thus, the visual angle
subtended by each object, and the size of the
corresponding retinal image, was the same in each case
(0.96 degrees of arc). The subject was seated at one
end of the corridor and first shown one of the test
objects set up at its appropriate viewing distance.
The subject was told to look at the height of the
stick and to tell whether a comparison stick, which
was placed at right angles to the line of sight to the
test object and … feet from the subject, was shorter
or taller than the test object. The comparison objects
were chosen from a series ranging in size from 1 inch
to 35 inches and were indistinguishable, except for
size, from the test objects. The order of presentation
of the comparison stimuli was chosen so that the
first comparison would easily elicit a “higher” or
“shorter than” response from the subject, while the
second reversed his response. Subsequent sticks were
chosen to be progressively nearer the point of
subjective equality. The order of presentation of the
comparison objects was determined by a 5 × 5 latin
square design. No difficulty was encountered in testing
the patients. They cooperated well, exhibited no
difficulty in understanding the instructions nor in
making judgments, and showed no signs of complaint
throughout the study, which lasted approximately
15 minutes for each patient. The two authors
served alternately as experimenters.
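
The constancy of the visual angle across the five test objects can be verified with a few lines of arithmetic. This is only a sketch: it assumes the fifth stick was 24 inches long and viewed at 120 feet (the values implied by the proportional spacing and by the 10 to 120 foot range mentioned in the Discussion), and the helper name is hypothetical.

```python
import math

# Stick lengths (inches) and viewing distances (feet).
# The last pair (24 in, 120 ft) is an assumption based on proportional spacing.
stick_lengths_in = [2, 4, 8, 16, 24]
viewing_distances_ft = [10, 20, 40, 80, 120]


def visual_angle_degrees(length_in, distance_ft):
    """Angle subtended at the eye by a stick of the given length and distance."""
    distance_in = distance_ft * 12  # convert feet to inches
    return math.degrees(math.atan(length_in / distance_in))


angles = [
    visual_angle_degrees(length, distance)
    for length, distance in zip(stick_lengths_in, viewing_distances_ft)
]
for length, distance, angle in zip(stick_lengths_in, viewing_distances_ft, angles):
    print(f"{length:>2} in at {distance:>3} ft: {angle:.3f} degrees")
```

Every pair has the same length-to-distance ratio (1 to 60), so each stick subtends the same angle, a little under one degree, which agrees closely with the 0.96 degree figure given in the text.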
RESULTS

The mean sizes of the comparison objects
which matched the test objects at the various
distances employed are given for both groups
along with their standard deviations in Table
1. Figure 1 represents a plot of mean matched
size as a function of test object distance. On
this figure a horizontal line represents a prediction
in terms of the “law of visual angle,”

TABLE 1

MATCHED SIZE OF A SERIES OF TEST OBJECTS AS A
FUNCTION OF DISTANCE FOR SCHIZOPHRENIC AND
NORMAL SUBJECTS

[Columns: test object size (inches); test object distance (feet); mean and SD of matched size for the schizophrenic and normal groups.]

[Figure 1: matched size plotted against distance of test object (feet; 10, 20, 40, 80, 120) for normals and schizophrenics, with the law of the visual angle drawn as a reference line.]

FIG. 1. Matched size as a function of distance for
a group of normals and of chronic schizophrenics.
(The test objects subtended the same visual angle at
all distances of observation.)


a condition which would obtain if judged size 
depended only on retinal image size. In the 
present study, retinal image size was constant 
so that matched size, according to this prediction,
would be the same for all test objects
and viewing distances. The other theoretical 
extreme, the “law of size constancy,” is repre- 
sented by the diagonal line. According to this 
prediction, perceived size is independent of 
distance so that matched size would be equal 
to the actual size of the test objects. As is 
typical in such experiments conducted with 
binocular vision, adequate illuminance, and a
well articulated visual field, the data for nor- 
mals lie very close to the prediction in terms 
of size constancy. The data for the schizo- 
phrenics also lie close to this theoretical line. 

The nature of the data indicated the use of 
a nonparametric test. The Mann-Whitney U 
test was applied (Siegel, 1956). From this 
analysis it was concluded that size judgments 
of the schizophrenic group do not differ sig- 
nificantly from the normal sample on this 
task (U = 119, p < .424). In addition, respective
variabilities of the two groups were
compared. As expected, none of the differences
reached significance.
DISCUSSION

The results of the present study indicate
no difference between the ability of chronic
schizophrenics and normals to judge correctly
the sizes of test objects from 10 to 120 feet.
Both groups exhibit a high degree of size con- 
stancy. These data are in agreement with 
those of Raush (1952), who found essentially 
the same results with nonparanoid schizo- 
phrenics. The disagreement of the present 
data, as well as those of Raush, with other 
investigators could well be a result of differ- 
ences in the patient population (e.g., Sanders 
& Pacht, 1952, tested outpatients) or in the 
procedure employed. In any event, it is im- 
portant to recognize that even in those studies 
which do differentiate between normals and 
schizophrenics the magnitude of the differences
is small in relation to differences found
as a result of variation of less subtle variables, 
such as the instructions given the subjects 
(Gilinsky, 1955; Holaday, 1933), or the age 
of the subjects (Beryl, 1926; Zeigler & Lei- 
bowitz, 1957). 

The present data have implications in rela- 
tion to psychopathology, as well as to a 
theoretical understanding of size constancy.
With respect to the chronic schizophrenic 
group tested, it is clear that whatever dis- 
orders of perception, thinking, or behavior 
they suffer from, their size constancy is un- 
affected. Indeed, it would be difficult to imag- 
ine such patients, who move about the hos- 
pital grounds and engage in sports, judging 
incorrectly the sizes of environmental objects. 
The difficulty with schizophrenics, as seen 
clinically, may not be in relation to neutral 
objects, such as the sticks utilized in this 
study, but to other individuals and to their 
own patterns of thinking and emotions. Ap- 
parently, the withdrawal characteristic of 
schizophrenia, in general, and of the subjects 
utilized in this study, in particular, may be 
significant in interpersonal relationships only, 
and are not indiscriminately employed as 
defenses in purely nonaffective areas, such 
as size judgment. As suggested by Bruner 
(1951), one would be more likely to find 
differences between schizophrenics and nor- 
mals if the “test objects” were human beings. 

The present size matching data are also of 
importance in relation to the mechanisms 
which subserve the size constancy effect. It 
has been demonstrated that the size constancy
of children for near objects is good,
but that the ability to judge correctly the
sizes of distant objects develops slowly (Zeigler
& Leibowitz, 1957). These data suggest
that there are a number of separate mechanisms
involved in size constancy, the nearer
distances being mediated by the kinesthetic
cues of accommodation and convergence, the
farther by a more slowly developing perceptual
learning process (Leibowitz & Moore,
1960). The available data indicate that this
learning process is complete by the early
teens or, in any event, since the patients in
the present study were veterans, before the
age at which the subjects tested were hospitalized.
The lack of any difference between
the groups in the present experiment suggests
that the mechanisms underlying size constancy
for neutral objects, such as used in
this study, are independent of personality
changes even as severe as those encountered
in schizophrenia. This conclusion is in agreement
with similar studies which demonstrate
that the development of size constancy is
independent of intellectual processes as indicated
by the lack of differences between
feebleminded and normal subjects on size
matching tasks (Jenkin & Morse, in press;
Leibowitz, in press). Thus, it would appear
that the development of size constancy,
although closely related to the age of the subject,
is independent of mental and personality
development.


SUMMARY

The ability to judge object size as a function
of distance was determined for a group
composed of chronic, undifferentiated schizophrenics,
as well as a control group of psychiatric
aides.

There were no significant differences in the
matches produced by the two groups. Both
judged correctly the sizes of the test objects
at all distances.

It is suggested that the absence of any differences
is due to the fact that size matching
requires abilities which are fully developed
prior to the onset of schizophrenia and which
are unaffected by the characteristic withdrawal
observed in this pathology.

AND PERFORMANCE ON THE EDWARDS
PERSONAL PREFERENCE SCHEDULE

Since its publication in 1954, the Edwards
Personal Preference Schedule (EPPS) has 
been the focus of considerable research bear- 
ing upon the problem of social desirability as 
a source of response variance. The EPPS was 
constructed so that each item consists of a 
pair of statements matched for social desir- 
ability, thus presumably forcing the test taker 
to respond in accord with his actual behav- 
ioral characteristics rather than the social 
desirability of the statements. To the extent 
that Edwards (1959b) is correct in consider- 
ing social desirability as contributing prin- 
cipally to measurement error, studies of the 
relationship between social desirability and 
EPPS performance are important in an over- 
all assessment of the test’s validity. However, 
the many investigations have provided rather 
inconsistent results and conclusions.

Several studies have concluded that social 
desirability has been eliminated as an im- 
portant determinant of performance on the 
EPPS. Navran and Stauffacher (1954) found 
a near-zero correlation between social desir- 
ability ratings of the traits being measured 
and scores on the EPPS scales. Borislow 
(1958) reported that subjects were able to 
change their test profiles under both social 
and personal desirability faking sets but con- 
cluded that desirability had been eliminated 
as a source of conscious response manipula- 
tion on the EPPS because no consistent scale 
patterns among the subjects were obtained 
under either set. Finding only a “slightly 
greater than chance” number of significant 
point biserial correlations between the sub- 
jects’ Social Desirability scale scores on the 


MMPI and frequency of choice of the first or
second statement in each item pair of the
EPPS led Kelleher (1958) to conclude that
social desirability plays an insignificant role
in EPPS performance.

In contrast to these negative findings, several
investigators have reported results which
indicate that desirability remains an important
source of response variance on the EPPS.
Heilbrun (1958) found a .60 correlation between
EPPS scale scores and personal desirability
of the test scales. Corah, Feldman,
Cohen, Gruen, Meadow, and Ringwall (1958)
obtained a .88 correlation between the percentage
of college subjects judging the first
statement in an item as more socially desirable
and the percentage of similar subjects in
an independent group who endorsed that
statement as self-characteristic. Edwards,
Wright, and Lunneborg (1959), pointing out
that Corah et al. used only 30 EPPS items
taken out of context, have repeated the procedure
using all items in the test. They report
somewhat lower correlations (.69 and .61) for
two samples, but the relationship between
desirability and endorsement still remained.

One common characteristic of all of the
studies described above is that they have
sought to evaluate the relationship between
desirability and EPPS performance using
some type of group desirability statistic (e.g.,
group average or group percentage) in the
analysis. It is possible that if the relationship
between each subject’s individual desirability
ratings of the statement alternatives in the
EPPS and his actual statement selections were
determined, a more accurate inference could

Social Desirability and EPPS

be made regarding the influence of social
desirability. The logical expectation would be
that if social desirability is related to EPPS
performance, this should be more evident
using an individualized approach, since the
social desirability values which each subject
assigns to test statements should predict his
statement endorsements better than would
averaged group values. A recent study by
Taylor (1959), however, found the opposite
to be the case. He determined the correlations
between both individual and group social
desirability values and endorsement of MMPI
items for schizophrenic patients and found the
group-averaged values correlated considerably
higher (r = .79) with performance than did
the individual values (r = .36).

The present study had three purposes: (a)
to relate individual social desirability values
and EPPS performance to clarify the extent
to which individual social desirability set
contributes to response selection; (b) to compare
the prediction of EPPS responses from individual
social desirability values with prediction
of these responses from group social
desirability values; and (c) if individual
values are found to be related to overall
EPPS performance, to evaluate the specific
relationship between individual desirability
and each EPPS scale.


METHOD

Subjects. The 58 subjects used in this study were
obtained from a large undergraduate psychology
course at the State University of Iowa. This sample
included 29 males and 29 females.

EPPS. The EPPS (Edwards, 1959a) is an objective,
rationally derived, multivariate personality
questionnaire. It includes 135 different statements
bearing on personality characteristics, 9 for each of
the 15 traits measured by the test. These statements
are arranged in 210 pairs with each pair including
statements matched for social desirability but
measuring different traits. The final score for each trait
represents the number of occasions out of a possible
28 in which the subject has selected that trait
statement as more characteristic than the paired
statement.

Procedure. All subjects were initially group
administered the EPPS under standard instructions.
About … later the same subjects were given
the 135 different statements included in the test and
asked to rate each as a trait in other people on a
nine-point scale from highly socially undesirable to
highly socially desirable. This rating procedure was
identical to that described by Edwards (1957) in his
initial derivation of statement desirability values.
Having obtained the subject’s EPPS responses under 
standard conditions and individual desirability rat- 
ings from the nine-point scale for both statements in 
each item pair, it was possible to determine the per- 
centage of times the subject endorsed as self-char- 
acteristic the statement having the higher individual 
social desirability value for him. Items for which the
subject had ascribed equal individual social desir- 
ability values to both statements were omitted from 
the analysis. 
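
The scoring rule just described can be sketched in a few lines. The data layout and function name below are hypothetical, chosen only to make the rule concrete: each item pair is represented by the subject's own nine-point ratings of statements A and B plus a flag for which statement he endorsed, and tied pairs are dropped before the percentage is taken.

```python
def endorsement_percentage(item_pairs):
    """Percentage of scored items on which the endorsed statement is the one
    the subject himself rated as more socially desirable.

    item_pairs: iterable of (rating_a, rating_b, chose_a) tuples.
    Pairs with equal ratings are omitted, as in the procedure described above.
    Returns (percentage, number_of_scored_items).
    """
    scored = 0
    agreements = 0
    for rating_a, rating_b, chose_a in item_pairs:
        if rating_a == rating_b:
            continue  # tied ratings carry no information about direction
        scored += 1
        a_is_more_desirable = rating_a > rating_b
        if a_is_more_desirable == chose_a:
            agreements += 1
    if scored == 0:
        raise ValueError("no item pairs with unequal ratings")
    return 100.0 * agreements / scored, scored


# Made-up ratings for five item pairs, for illustration only.
demo_pairs = [
    (7, 5, True),   # A rated higher, A endorsed: agreement
    (4, 6, True),   # B rated higher, A endorsed: no agreement
    (8, 8, False),  # tie: omitted
    (3, 6, False),  # B rated higher, B endorsed: agreement
    (9, 2, True),   # A rated higher, A endorsed: agreement
]
percentage, n_scored = endorsement_percentage(demo_pairs)
print(percentage, n_scored)
```

A value near 50% would indicate that endorsement is unrelated to the subject's own desirability ratings, which is the chance expectancy against which the obtained percentages are tested.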

The second step was to determine the percentage 
of times each subject endorsed as self-characteristic 
the more highly socially desirable statement in each 
item pair where the desirability values were those 
assigned to the statements by Edwards in his initial 
construction of the test. This percentage reflects the 
correspondence between EPPS response and group 
social desirability values obtained from an inde- 
pendent group of college students. There are 204 of 
the 210 EPPS items for which the group values of 
the paired statements differ. 

Finally, the values assigned to the nine statements 
measuring each of the 15 EPPS variables were av- 
eraged for each subject, and these mean values de- 
fined the subject’s individual social desirability rat- 
ings for the 15 traits. High and low social desirability 
groups were then constituted independently for each 
trait and the mean scale scores for these high and 
low groups were compared to assess whether the re- 
lationship between individual desirability and EPPS 
performance varied over scales. Since there are estab- 
lished sex differences on the EPPS scales (Edwards, 
1959a), precautions were taken to insure approx- 
imately equal numbers of males and females in both 
desirability groups for each scale. This was accom- 
plished by distributing individual social desirability 
scores for each scale separately by sex and defining 
high and low groups by cutting at or near the 
median for both sex distributions. The two high and 
two low desirability groups for each scale were then 
recombined into one high and one low group with 
nearly equal numbers of males and females. 


RESULTS 


Individual social desirability values and
EPPS performance. There was a mean of
… items out of a possible 210 in which
the paired statements had discrepant individ- 
ual social desirability values for the 58 sub- 
jects in this study. For those items within 
which the statement values varied, the sub- 
jects averaged endorsing the individually more 
desirable statement as more self-characteristic 
67.16% of the time. This percentage value 
differs significantly from a 50% chance expectancy
(t = 10.34 for 57 df, p < .001) and
clearly indicates that performance on the 
EPPS is related to the individual ratings of

TABLE 1

COMPARISON BETWEEN EPPS SCALE SCORES FOR THE 15 SEPARATE HIGH AND
LOW SOCIAL DESIRABILITY GROUPS

                   High Social Desirability    Low Social Desirability
EPPS Scale          Mean     SD     N           Mean     SD     N         t a
Achievement        15.69    5.55    29         13.52    4.76    29       1.58
Deference          11.62    4.00    29         10.90    3.04    29       …
Order              11.86    3.95    29          8.83    3.58    29       3.03**
Exhibition         15.93    2.27    29         14.14    4.58    29       … b
Autonomy           15.33    4.04    30         11.89    3.84    28       1.46
Affiliation        17.17    3.25    29         14.72    3.87    29       2.58**
Intraception       20.47    4.68    30         15.71    5.32    28       3.61**
Succorance         11.20    4.69    30         10.71    3.68    28        .44
Dominance          16.86    3.60    29         12.97    4.82    29       3.4…**
Abasement          12.63    4.72    30         12.46    5.36    28        .13
Nurturance         16.97    4.75    31         12.93    4.61    27       3.26**
Change             17.39    4.65    28         15.43    5.14    30       1.51
Endurance          14.26    4.16    27         11.23    4.31    31       2.…
Heterosexuality    18.24    4.94    34         15.00    5.01    24       2.36*
Aggression         14.36    4.62    28          9.50    4.47    30       4.05**

a One-tailed t tests were used.
b The Cochran-Cox approximation was used because of heterogeneity of variance.
* Significant at .05 level.
** Significant at .01 level.


statement social desirability. For the 58 sub- 
jects the range of endorsement percentage in 
the individually socially desirable direction 
was from 35.03 to 79.66, the subject provid- 
ing the lower end of the range being unique 
in that he was the only subject to fall below 
the 50% mark. 
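
The comparison of the mean endorsement percentage with a 50% chance expectancy is a one-sample t test. The sketch below shows the standard computation; the function name and the four example percentages are hypothetical, and the article does not print the per-subject values needed to reproduce its reported t of 10.34.

```python
import math
from statistics import mean, stdev

def one_sample_t(values, mu):
    """Return (t, df) for testing whether the mean of `values` differs from `mu`."""
    n = len(values)
    standard_error = stdev(values) / math.sqrt(n)  # sample SD with n - 1 denominator
    return (mean(values) - mu) / standard_error, n - 1

# Hypothetical endorsement percentages for four subjects, tested against 50%.
t_value, df = one_sample_t([60.0, 70.0, 65.0, 75.0], 50.0)
```

With 58 subjects the same formula gives 57 df, which matches the degrees of freedom reported in the text.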

Individual vs. group social desirability 
values. When the group mean social desir- 
ability values of the statement alternatives 
were considered, it was found that the sub- 
jects endorsed the group-defined higher so- 
cially desirable statement on 55.80% of the 
204 imperfectly matched items. This result is 
very similar to that reported by Goodstein 
and Heilbrun (1959) who found, for an inde- 
pendent sample of 248 college subjects, that 
the group-defined more socially desirable 
statement was endorsed 55.92% of the time 
on the average. The 55.80% value found 
in the present study differs significantly (t 
= 8.99 for 57 df, p < .001) from a chance 
expectancy of 50%, clearly indicating that 
performance on the EPPS is also related to 
group-defined desirability of the statements. 

A comparison between the predictableness 
of EPPS statement endorsement from group 
social desirability values and from individual 


social desirability values was made, and it 
was found that the mean percentage of 67.16 
for items in which the endorsed statement was 
the more individually desirable was significantly 
higher than the 55.80% of items in which the 
endorsed statement was group-defined more 
desirable (t = 9.86 for 56 df, p < .001). 

Individual social desirability and separate 
EPPS scales. Although it had been shown in 
this study that the individual's judged 
desirability of the statement alternatives did 
show a significant relationship to statement 
endorsement on the EPPS, the question still 
remained as to the extent of this relationship 
for the separate scales of the test. Data 
pertinent to this question are presented in 
Table 1, which gives the means and standard 
deviations on each of the 15 EPPS scales for the 
15 separate high and low social desirability 
groups and the results of t test comparisons. 
It can be seen that the group of subjects 
rating the trait statements as more highly 
desirable had the higher mean on each scale as 
compared to the group rating the trait 
statements as less desirable, the differences 
being significant for 9 of the 15 comparisons 
and approaching significance at less than the 
.10 level on 3 more. Thus the positive 
relationship between individual desirability and 
endorsement appears to hold for most but not 
for all of the EPPS scales. 
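
A t comparison of this kind can be approximated from the summary statistics alone. The sketch below assumes an ordinary pooled-variance t test and assumes the printed SDs use an n - 1 denominator; neither assumption is confirmed by the article, so the result for the Order scale (about 3.06) is close to, but not identical with, the printed 3.03.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance two-sample t from summary statistics. Returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Order scale, high vs. low desirability groups, values taken from Table 1.
t_order, df_order = pooled_t(11.86, 3.95, 29, 8.83, 3.58, 29)
```

The pooled test is not appropriate where the variances differ markedly, which is presumably why the authors report using the Cochran-Cox approximation for one scale.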


DISCUSSION 


One major finding in the present study is 
that college subjects endorsed the individually 
more socially desirable response as 
self-characteristic on at least two out of 
every three items in which the social 
desirability values of the paired statements 
differed. This would seem to clearly indicate 
that social desirability has not been 
eliminated as an important source of performance 
variation on the EPPS as has been suggested 
by previous investigators (Borislow, 1958; 
Kelleher, 1958; Navran & Stauffacher, 1954). 
The inconsistencies in results among the 
various investigators bearing upon this issue 
most likely have stemmed from the use of 
group social desirability statistics which, 
given the results of the present study, are 
less likely to be related to the individual's 
response endorsement than the desirability 
values for that particular individual. 

A major interest in this study was whether 
an individual's responses to a structured 
personality questionnaire are more highly 
related to his own judged social desirability 
of the responses or to averaged group 
desirability values. Taylor's (1959) finding 
that group values were more highly related 
appeared to be inconsistent with the logical 
assumption that the closer the parameters of 
behavior are approximated, the more 
effectively can behavior be predicted for the 
individual. The results of the present study 
were not in agreement with those of Taylor, 
since it was found that college subjects 
selected the statement alternative on the EPPS 
which was individually more desirable more 
than 67% of the time while these subjects 
endorsed the group-defined more desirable 
statement on less than 56% of the items. 
Taylor suggested that the lesser relationship 
obtained between individual social 
desirability values and MMPI performance than 
between group values and performance could 
be attributed in large measure to the lower 
reliability of individual values which, in 
turn, would serve to more seriously 
attenuate this correlation. If this is true, it should 
be pointed out that the opposite results were 
found in the present study despite the lower 
reliabilities of the individual social desirabil- 
ity values. Since there were so many proce- 
dural differences between the Taylor study 
and the present one (e.g., nature of test item 
content, true-false vs. forced choice format of 
the tests, nature of the subject population, 
etc.), further speculation regarding the differential 
results appears unwarranted at the 
present time. 

The analysis of relationships between indi- 
vidual desirability and specific scores suggests 
a fairly general positive relationship over 
scales. An interesting finding was that the 
three scales which appear to be unrelated to 
social desirability (Deference, Succorance, 
and Abasement) have a common psycholog- 
ical element: their characteristic behaviors 
involve some type of subordination of the 
individual to another person. This suggests 
the hypothesis that the young adult college 
subjects used in this study, being in transition 
from a period of subordinate childhood rela- 
tionships into a period of more superordinate 
or coequal adult relationships, have not as yet 
stabilized their value systems regarding sub- 
ordinate, equal, and superordinate interper- 
sonal roles. A consequence of this would be 
that what the subject perceived as socially 
desirable and self-characteristic relative to 
role-related behaviors would be at least 
temporarily unrelated. 


SUMMARY 


This study investigated three issues relative 
to performance on the Edwards Personal 
Preference Schedule: (a) the relationship be- 
tween the social desirability of statement 
alternatives on the questionnaire and the 
endorsement of statements as self-characteristic 
when the individual's own desirability 
values are used as predictors, (b) the predic- 
tion of statement endorsement using the in- 
dividual’s social desirability values for the 
statements as opposed to prediction using 
group-defined values for the statements, (c) 
the differential relationships between the individual's 
social desirability values and performances 
on the separate scales of the EPPS. 

The results indicated that the individual's 
social desirability set is an important source 
of variance in EPPS performance, since col- 
lege subjects endorsed as self-characteristic 
the more highly valued statement at least two 
out of every three occasions when the state- 
ment alternatives were assigned different in- 
dividual social desirability values. It was also 
found that individual social desirability values 
were more highly related to EPPS perform- 
ance than were group desirability values, a 
finding which differs from that reported by 
Taylor (1959) who used the MMPI as the 
performance variable. Finally, analysis by 
separate scales suggested that individual so- 
cial desirability set is related to most but not 
all of the EPPS variables. 
Before the etiology and treatment of children's 
behavior disorders can be sensibly examined, 
the disorders themselves must be defined. 
For the sake of generality and descriptive 
efficiency, any concepts employed in such 
definition should be nonarbitrary, unitary, 
and independent. Factor analytic methods 
have been employed with salutary effect in 
the structural definition of adult disorders 
(Lorr, Jenkins, & O'Connor, 1955; Rub[illegible]; 
Wittenborn, 1951; Wittenborn & Holzberg, 1951), 
but similar work with the disorders of 
childhood has only begun (Hewitt & Jenkins, 
1946; Himmelweit, 1953). The present study 
extends and refines previous research by 
factorizing uniformly obtained judgments of 
problem behavior during the kindergarten and 
elementary school years, and by examining 
changes in problem expression during that time. 


SUBJECTS AND PROCEDURES 


In the absence of any accepted theory of structural 
organization among children's behavior disorders, 
a sample of problems was chosen by empirical 
means. The referral problems of 427 representatively 
chosen cases at a guidance clinic were recorded, and 
frequencies tabulated for all problems mentioned 
more [text illegible] remaining variables was 
determined exclusively by the frequency with which 
they had occurred, and the 58 most common problems 
were selected for general investigation. 

In use, the variables were ordered randomly, assembled 
in a format requiring ratings of 0 (no problem), 
1 (mild problem), or 2 (severe problem), and 
submitted for completion to 28 teachers of 831 kin- 
dergarten and elementary school children in six dif- 
ferent schools in Illinois. The choice of school chil- 
dren, rather than clients undergoing treatment for 
judged disorders, was based on the assumption that 
most such disorders are extremes of continuous “nor- 
mal” dimensions, and was determined by the desir- 
ability of obtaining uniform data on large numbers 
of subjects within the age range under consideration. 
The large sample requirement has been met previ- 
ously (Hewitt & Jenkins, 1946; Himmelweit, 1953) 
by recourse to case history information, but the dan- 
gers of that expedient seemed greater than those in 
the present course, and the study was begun in the 
hope that otherwise unselected school children would 
present sufficiently numerous, severe problems to war- 
rant sensible analysis and yield meaningful results. 
Distributions of ratings were generally eccentric, but 
the effects were reduced by excluding some rarely 
occurring problems (dizziness, soiling, and enuresis, 
which occurred in less than 3% of the cases, were 
eliminated), and by pooling judgments of mild and 
severe problems (ratings of 1 and 2) for all the 
remaining variables. 
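
The two data-reduction steps just described can be illustrated as follows. This is a minimal sketch under stated assumptions: the function names, the dictionary layout, and the example data are hypothetical, and the 3% threshold is applied exactly as the text states it.

```python
def dichotomize(ratings):
    """Pool ratings of 1 (mild) and 2 (severe) into a single 'problem present' code."""
    return [1 if rating >= 1 else 0 for rating in ratings]

def drop_rare_items(items, min_rate=0.03):
    """Keep only items checked for at least `min_rate` of the children.

    items: dict mapping item name -> list of 0/1 codes, one per child (assumed layout).
    """
    return {name: codes for name, codes in items.items()
            if sum(codes) / len(codes) >= min_rate}

pooled = dichotomize([0, 1, 2, 0])
kept = drop_rare_items({
    "dizziness": [1] + [0] * 99,       # 1% of cases: excluded
    "restlessness": [1] * 5 + [0] * 95,  # 5% of cases: retained
})
```

Dichotomizing in this way is what makes phi coefficients a natural choice for the intercorrelations computed later.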

For analysis, the sample was divided into four 
groups: a kindergarten sample (N = 126), a first and 
second grade sample (N = 237), a group from the 
third and fourth grades (N = 229), and a fifth and 
sixth grade sample (N = 239). Two teacher ratings 
were available for each kindergarten child; the num- 
ber of actual ratings used in the analysis is thus 
double the N given above for the kindergarten group. 
Phi coefficients of intercorrelation were computed 
separately for the four samples. From each correla- 
tion matrix, 10 centroid factors were extracted, and 
from each set of centroid factors 2 were rotated to 
conform with Kaiser’s varimax criterion (Kaiser, 
1958). 
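
For readers unfamiliar with the phi coefficient, the sketch below computes it for one pair of dichotomized problems from the four cell counts of their 2 x 2 table. The function name and the example vectors are hypothetical; the centroid extraction and varimax rotation steps are not reproduced here.

```python
import math

def phi_coefficient(x, y):
    """Phi coefficient between two equally long lists of 0/1 codes."""
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)  # both present
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)  # both absent
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        return float("nan")  # undefined when a variable has no variation
    return (a * d - b * c) / denominator

perfect = phi_coefficient([1, 1, 0, 0], [1, 1, 0, 0])
unrelated = phi_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
```

Applying such a function to every pair of variables would yield one correlation matrix per age group.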

Human judgment was involved only once in all 
that analysis—in deciding how many factors to re- 
tain for rotation. The decision to keep only two was 
based on inspection of plots of variance removed by 
successive centroid factors, and the application of 
criteria for factor retention developed elsewhere 
(Peterson, 1960). Out of personal curiosity, five-
factor solutions were also tried for each data set; 
but these, as expected, were much less stable over 
age than the two-factor solutions, and only the latter 
will be reported. 

Factor scores were computed for all cases by un- 
weighted summation of pertinent problems checked 
by the teachers. Interjudge correlations were com- 
puted for the kindergarten sample, and further at- 
tention directed toward comparing the four age 
groups. Data for boys and girls were separated in 
all comparisons, because of well known sex differ- 
ences in problem expression, and mean factor scores 
were computed to show trends in the development 
of behavior problems over the years of middle 
childhood. 
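
Unweighted summation simply counts how many of a factor's problems were checked for a child. The sketch below is illustrative only: the function name and data layout are hypothetical, and the item list shown is a short excerpt rather than the full set of variables used for either factor.

```python
def factor_score(child_codes, factor_items):
    """Count the problems belonging to one factor that were checked for one child.

    child_codes: dict mapping item name -> 0/1 code (assumed layout).
    factor_items: names of the items defining the factor.
    """
    return sum(child_codes.get(name, 0) for name in factor_items)

conduct_items = ["Disobedience", "Disruptiveness", "Fighting"]  # excerpt only
child = {"Disobedience": 1, "Fighting": 1, "Shyness": 1}
conduct_score = factor_score(child, conduct_items)
```

Unit weights avoid capitalizing on sample-specific loadings, which suits a comparison across four separately analyzed age groups.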


RESULTS 
The Factors 


All four sets of rotated factor loadings are 
presented together in Table 1, an arrange- 
ment permitted only by the marked similarity 
between results at the four age levels.2 Factor 1 
is obviously a conduct problem dimension, 
closely resembling the like-named factor 
isolated by Himmelweit (1953) and “unsocialized 
aggression” as defined by Hewitt and 
Jenkins (1946). Factor 2 has been labeled 
personality problem in accordance with Him- 
melweit’s designation and common usage. It 
is much like the “over-inhibited behavior” 
dimension which Hewitt and Jenkins found. 
Actually these terms, “personality problem” 
and “conduct problem,” are grossly inappro- 
priate. Both problems are personality expres- 
sions, and both affect conduct. But the cen- 
tral meanings seem clear enough. In one case, 
impulses are expressed and society suffers; in 
the other case impulses are evidently inhibited 
and the child suffers. 

The generality of these factors appears to 
be enormous. Not only do they emerge with 
striking uniformity over the limited age range 
and the particular variables and subjects 
examined here; they have appeared in very 
much the same form with the recorded problems 
of treatment cases (Hewitt & Jenkins, 
1946; Himmelweit, 1953), and remarkably 
similar factors have appeared in the 
questionnaire behavior of delinquent boys (Peterson, 
Quay, & Cameron, 1959). Considering all 
studies together, age has varied from early 
childhood to adolescence; problem status has 
varied from none, through clinic attendance, 
to incarceration for delinquency; data sources 
have varied from case history records, to 
standard ratings, to questionnaire responses; 
methods of factor extraction have varied from 
cluster inspection to centroid analysis; 
rotational methods have varied from none, through 
visual shifts to both orthogonal and oblique 
solutions, to analytic techniques. Through it 
all, the factors have stayed the same. 
Their definition at last seems adequate. The 
time is ripe for study, particularly experimental 
study, of dynamics, etiology, and treatment. 


2 The rating schedule, correlation matrices, and 
unrotated centroid factor matrices have been deposited 
with the American Documentation Institute. Order 
Document No. 6632 from ADI Auxiliary Publications 
Project, Photoduplication Service, Library of 
Congress, Washington 25, D. C., remitting in advance 
$2.00 for microfilm or $3.75 for photocopies. Make 
checks payable to: Chief, Photoduplication Service, 
Library of Congress. 


Factor Scores and Their Reliability 


Such investigations, however, cannot proceed 
until various properties of the measuring 
devices have been examined. Factor scores 
were computed by unweighted summation 
over the first 15 variables for each factor as 
listed in Table 1. Below that point, many of 
the variables either have no appreciable loading 
on either dimension, or approximately 
equal loadings on both. The former condition 
holds especially for skin allergy, hay fever, 
nausea, and stomach-aches, which may well 
be purely somatic and qualitatively distinct 
from the other variables examined. The latter 
condition, roughly equal loadings on both 
factors, holds for crying, nervousness, and 
certain school attitudes, variables which are 
either very general in nature or exhibit some 
kind of developmental change. 

Reliability and interfactor correlation were 
examined for the kindergarten sample alone, 
since only for that group were dual ratings 
available. Interjudge rs of .77 and .75 were 
found for Factors 1 and 2, respectively. These 
figures are exceptionally good for ratings, and 
are sufficiently high for most research 
purposes. The correlation between factors was 
.18, low enough to meet most requirements 
for independence. 




Behavior Problems of Middle Childhood 


TABLE 1 
ROTATED FACTOR LOADINGS 

                        Conduct Problem         Personality Problem 
Factor                  K    1-2   3-4   5-6    K     1-2   3-4   5-6 
Conduct Problem 
  Disobedience          74   77    69    86     03    04    07    11 
  Disruptiveness        73   67    66    76     -04   19    -03   11 
  Boisterousness        68   63    67    68     -16   07    -07   -09 
  Fighting              54   73    61    77     [?]   -04   11    07 
  Attention-seeking     54   67    63    76     -12   10    -07   02 
  Restlessness          64   58    62    71     04    24    06    20 
  Negativism            56   64    60    70     12    27    20    15 
  Impertinence          57   57    53    76     02    -08   00    08 
Tritability 53 59 57 69 y 
emper ta 5 37 49 64 08 11 22 16 
yperactivity. ae =H 49 54 49 —06 12 00 03 
Profanity 30 42 64 60  —07 11 02 00 
{ln TEELT: 
se Perativeness a w & a 8 2 2 
re auity ao s 9 5< 2 8 4# D 
"esponsibility 60 65 49 65 a 
ntent a aA 61 36 69 39 30 57 28 
aina eness vi 50 36 37 29 36 55 31 
Shareen school 54 31 60 37 34 55 29 
Dislike S Of attention span 8 p 2 4 06 2% H 13 
PH for school ae 25 46 50 40 44 22 26 
Thumb suchi . t 8 § 2 ita 
Spe Sucki = 2 5 a 
Skin allergy "g i6 o2 —20 —05 01 5 
Perso i 
nality 
Teit Problem m 13 17 39 56 66 62 
“ach 85 Of inferiority 12 ce B 16 60 61 60 58 
Sodan self-confidence 12 08 04 05 50 64 61 60 
Prone Withdrawal —03 a8 15 24 54 59 60 58 
Sel neness to become flustered v _03 —15 16 55 60 47 63 
Sh nsciousness -e -18 —23 “i a 37 38 : 
La osiy 01 19 a 31 52 47 61 43 
a ärgy ó 2 % o #9 8s 3 48 
e lity to have fun -15 06 “os 29 47 43 64 42 
Pression 00 20 oa 14 45 43 64 41 
icticence 06 20 18 30 40 53 54 46 
3S Persensitivity 06 26 = 29 39 48 45 At 
Howes y 09 be 
e 1o Wsiness > 02 09 ot 05 51 32 50 31 
a ofness a6 -8 z 47 57 64 4 
peoceupation 09 ” 21 51 40 ca H oe 
atCki ot interestin envi 24 30 Z 7 43 54 34 36 
Cly te; terest in environment vi 21 36 s = 
D, Msinesg 16 21 49 53 46 69 47 
poe dreamin 14 26 39 39 41 62 27 41 
gasin S 21 31 a 52. 4 42 48 30 
e 8estibility 04 29 a 59 27 48 32 19 
BE 15 a 35 14 08 45 37 32 
g “ference à pr 28 pt —04 24 47 20 20 
Shecifie eee younger playmates 09 24 —02 én 27 35 29 16 
Hettering 11 17 os 00 07 46 22 27 
Ntdaches 19 21 m = ü 14 38 37 
Tse —10 23 å j ŽŽ 00 20 39 35 
Sto DEY f 27 07 = 06 —0l 30 38 29 
p™ach-achee na 10 18 oe 16 ot 38 16 ot 
Ma rence for older playmates —14 a 26 04 =a H0 of 17 
 eSturbation Pa 08 yf et oo x o %3 
5 = 
FIG. 1. Mean conduct problem scores. 


Developmental Changes 


Mean factor scores were computed for boys 
and girls in all age groups, and the results are 
shown in Figures 1 and 2. Throughout mid- 
dle childhood, boys consistently display more 
severe conduct disturbances than girls, pos- 
sibly as a function of constitutional differ- 
ences, but more likely in response to different 
levels of social expectancy and tolerance for 
misbehavior. An interesting reversal, however, 
occurs in the expression of personality prob- 
lems. Boys evidently start school with more 
personality problems than girls, but around 
the seventh or eighth year such problems be- 
come more plentiful among girls. Again, so- 
cial pressures for sex-type conformity seem 
the likeliest causal agents. Reasons for the ap- 
parent upswing in problems at the fifth and 
sixth grade level are obscure. The increase 
may arise from the early agitation of adolescence, 
and the difficulty this can bring about 
in our society. 


SUMMARY 


This study was designed to improve structural 
definition of children's behavior problems 
and to examine changes in those problems 
over the years of middle childhood. 
Teacher ratings of 58 clinically frequent problems 
were obtained for 831 kindergarten and 
elementary school children, and four separate 
factor analyses were conducted, one for the 
kindergarten subjects and one each for children 
in grades 1-2, 3-4, and 5-6. Two factors 
emerged with remarkable invariance from 
all four analyses. The first implied a tendency 
to express impulses against society, and was 
labelled “conduct problem.” The second 
contained a variety of elements suggesting low 
self-esteem, social withdrawal, and dysphoric 
mood. It was called “personality problem.” 
Both factors have now appeared in a number 
of studies despite wide differences in subjects, 
variables, and analytic procedures. 

Comparisons over age showed that boys 
displayed more severe conduct problems than 
girls at all age levels examined. Kindergarten 
and primary school boys also showed more 
severe personality problems than girls, but 
at the two highest age levels this trend was 
reversed, and girls displayed more personality 
problems than boys. 

The definition of both dimensions seems 
adequate. Reliable, independent measures of 
the factors can be obtained, and the way 
toward investigation of dynamics, etiology, and 
treatment now seems clear. 
ATTRIBUTION OF TRAITS AND EMOTIONAL HEALTH 
AS FACTORS ASSOCIATED WITH THE PREDICTION 
OF PERSONALITY CHARACTERISTICS 
OF OTHERS 


MARVIN SPANNER 2 


University of California, Berkeley 


Many unsystematized explanations have 
been offered concerning the manner in which 
judges predict the personality characteristics 
of others. The present paper will focus on 
two theories currently popular. The first, 
based on an interpersonal theory of personality, 
has been most extensively represented 
by Harry Stack Sullivan. Sullivan’s (1947) 
ideas on the subject are reflected in his 
famous dictum: 


“It is not that as ye shall judge so shall ye be 
judged, but as you judge yourself, so shall you judge 
others; strange but true so far as I know, and with 
no exception.” 


The above quotation suggests that interpersonal 
prediction is based primarily on an 
attributive mechanism in which the state of 
the individual's self-concept determines the 
quality of the interpersonal appraisal. One 
can deduce from his statement the following 
hypothesis: 


1 This paper is based in part on a thesis submitted 
in partial fulfillment of the requirements for the PhD 
in the Department of Psychology of the University 
of California, Berkeley. The writer wishes to express 
his appreciation and indebtedness to Harrison G. 
... Research; and Victor B. Cline. 
... Air Force Base, Alabama. Permission is granted for 
reproduction, translation, publication, use, and 
disposal, in whole and in part, by or for the United 
States Government. Personal views or opinions 
expressed or implied in this publication are not to be 
construed as necessarily carrying the official sanction 
of the Department of the Air Force or of the Air 
Research and Development Command. 

2 Now at Neuropsychiatric Institute, University of 
California, Los Angeles. 


I. There is a positive relationship between 
accuracy in predicting the personality charac- 
teristics of others and similarity of person- 
ality characteristics of the judge and the 
individual being judged. This would logically 
appear to follow because an understanding 
of others is based, according to Sullivan, on 
one’s self. Thus, if one tends to perceive 
others as similar to one’s description of one’s 
self, then those actually similar will be perceived 
accurately. Lindgren and Robinson 
(1953) make a similar point when they state, 
in criticizing Dymond’s study (1950), that 
“Conventional people get good scores on 
empathy tests because most of their partners 
(or referrants) in the test are also conventional.” 

The second group of explanations of predictive 
ability involves the mental health of 
the individual. Cline (1953) in an exhaustive 
review of the literature suggests that, on the 
whole, a good judge of others is emotionally 
sound, has good interpersonal relations, is 
happier, more popular and flexible. In addition, 
most clinicians assume that an emotionally 
healthy individual is a better judge of 
personality because he has fewer problems 
which interfere with his understanding of 
others. In the present instance, two measures 
have been utilized to evaluate the emotional 
health of the subjects. The first is the 
correlation between the judge’s evaluation of 
himself and his evaluation of the sort of 
individual he would ideally like to be. The 
second is a composite rating by experienced 
observers of the “soundness” of the judge. 

We have, therefore, the following two 
hypotheses relating the “mental health” of the 
judge and his ability to predict personality 
characteristics of others: 

II. There is a positive relationship between 
the self-ideal-self correlation of the 
judge (called self-acceptance and used as a 
measure of mental health) and his ability to 
predict personality characteristics of others. 

III. There is a positive relationship between 
the rated “soundness” of a judge (used 
as a measure of mental health) and his ability 
to predict personality characteristics of 
others. 

METHOD 

Since both the tests of judging ability and the 
prediction instruments were devised and fully 
reported by Cline (1955), the following is an 
abbreviated description of these procedures. 


Test of Judging Ability 

The basic procedure used was a sound movie 
of four fictitious employment interviews. The 
interviewees were male undergraduates. Three of them 
were 19 years old and single. The fourth was 33 
years old, married, childless, and a veteran. Each 
judge was asked, after viewing each of the 
interviews, to make a series of predictions 
about the interviewee (hereafter called social 
object) on three specially devised instruments, only 
one of which was used in the present analysis. 

The four filmed interviews, conducted by a 
[illegible], were highly structured and fairly constant. 
These were chosen from a group of nine films in 
order to obtain the most diverse personalities among 
the social objects. Cline has stated that the 
interviews were divided into three phases: a standard 
interview situation, a stress situation in which the 
interviewer is highly critical of the interviewee, and 
a relaxed after-interview abreaction session in which 
the interviewee's reactions to the interview are 
discussed. The intent was to get a dynamic record of 
a series of subjects on a sound film, rich and varied 
enough in the use of verbal and visual cues so that 
judges would have an adequate sample of each 
social object's social technique. 

Prediction task used in the present study. The 
judges who observed the four social objects predicted 
their responses on a 100-word check-list (PWC). 
This list was derived from an Adjective Check-List 
compiled by Gough.3 

3 Gough, H. G. Predicting success in graduate 
training. (Progress report) Unpublished manuscript, 
University of California Institute of Personality 
Assessment and Research, 1952. 

The corrected split-half reliability for the PWC as 
a test of judging ability for a group of 100 college 
undergraduates was .83 ± .06. 

The PWC was also used by both the judges and 
social objects to describe themselves. Comparisons 
were, therefore, readily made among the judges’ 
prediction of the social object, the social objects’ 
self-reports, and the judges’ self-reports. 

Two other prediction tasks were given to the 
judges, although they were not used in the present 
analysis. One of them, the Behavioral Postdiction 
Test, in which the judge was required to predict 
the social object’s behavior in real life, was discarded 
because it was discovered, when a simple analysis 
of variance was carried out, that the source of 
variation attributable to the judges was not 
significant. A third instrument, a multiple-choice sentence 
completion test, was also discarded as a test of 
judging ability, because Cline had found that the 
corrected split-half reliability for a group of 100 
college undergraduates was only .36 ± .06. 


Judges 

One hundred Air Force Captains were tested at 
the Institute of Personality Assessment and Re- 
search, as part of a large scale study of officer 
effectiveness. They were tested in groups of 10 in 
a living-in situation similar to that employed by the 
OSS during World War II (Office of Strategic Serv- 
ices, 1948). Their average age was 33.6 with a range 
of 27 to 49. Ninety-five percent of the officers were 
married, with an average of two children. Fifty-six 
had attended college, seven had done some graduate 
work, and only three had not graduated from high 
school. 


Measures 
Accuracy. This variable is defined as the tetra- 
choric correlation between the social objects’ self- 
reports and the judges’ prediction of them on the 
PWC. Since there are four social objects, each one 
of the 100 judges has four accuracy scores. 
Similarity. This measure is defined as the tetra- 
choric correlation between the social objects’ self- 
reports and the judges’ self-reports on the PWC. 
Here too, each one of the 100 judges has a similarity 
score for each of the four social objects. 
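
The article does not say how the tetrachoric correlations were obtained. As a rough illustration only, the sketch below uses the well-known cosine-pi approximation applied to the 2 x 2 table of checked and unchecked adjectives; the function names and the example data are hypothetical, and an exact tetrachoric estimate would require a bivariate normal computation not shown here.

```python
import math

def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    a, d: agreement cells (both checked, neither checked).
    b, c: disagreement cells.
    """
    if b * c == 0:
        return 1.0 if a * d > 0 else float("nan")
    return math.cos(math.pi / (1 + math.sqrt((a * d) / (b * c))))

def accuracy_score(prediction, self_report):
    """Approximate agreement between a judge's predicted checks and the
    social object's own checks; both are equally long lists of 0/1 codes."""
    pairs = list(zip(prediction, self_report))
    a = pairs.count((1, 1))
    b = pairs.count((1, 0))
    c = pairs.count((0, 1))
    d = pairs.count((0, 0))
    return tetrachoric_cos_pi(a, b, c, d)

# Hypothetical 100-adjective checklists: 80 agreements, 20 disagreements.
predicted = [1] * 40 + [1] * 10 + [0] * 10 + [0] * 40
reported = [1] * 40 + [0] * 10 + [1] * 10 + [0] * 40
score = accuracy_score(predicted, reported)
```

The same function applied to a judge's self-report and a social object's self-report would give a similarity score of the kind defined above.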
Self-Acceptance (self-ideal-self correlation). Each 
judge completed the PWC twice, once describing 
himself as he was and then a second time describing 
himself as he would ideally like to be. The measure 
of association used was the phi coefficient. It is 
assumed that the greater the degree of association 
the more self-accepting and therefore the “healthier” 
the individual. 

Soundness. This variable was drawn from a pool 
of 30 traits which were used by 10 staff members 
in rating the judges (officers). The following pro- 
cedure was used. After the first four groups of 10 
officers had been run, 10 staff members of the 
Institute of Personality Assessment and Research 
rated each of the 40 officers in the 30 traits. A 
normal distribution was required of the raters, using 
a five-point scale. The same procedure was followed 
in rating the remaining 60 judges, after they had 
all been assessed. A composite trait rating was 
derived from each assessee by combining the 
individual ratings of the 10 staff members. Each item 
of the trait pool was then assigned a scale value 
for each assessee. 


TABLE 1 
CORRELATION COEFFICIENTS (PEARSONIAN) BETWEEN 
ACCURACY AND SIMILARITY, SELF-ACCEPTANCE, AND 
SOUNDNESS, FOR EACH OF THE FOUR 
SOCIAL OBJECTS 

                   Accuracy in Predicting the 
                   Four Social Objects 
Measure            I       II      III     IV 
Similarity         .01     .28*    .00     .00 
Self-Acceptance    -.06    -.15    .06     -.03 
Soundness          -.10    -.13    .13     -.09 

* p < .01. 


Soundness was defined as: 


Maturity in personal relations; self-insight and 
self-acceptance, as well as acceptance and 
understanding of others. Absence of serious emotional 
problems. Stability of mood and manner. Good 
balance of social conformity and spontaneity. 

All of the above measures, once obtained, are 
simply treated as scores, for purposes of statistical 
analysis. 


RESULTS 


The first hypothesis, that accuracy of
prediction of personality traits of others is
dependent on the degree of similarity of judge
and social object, was tested by correlating
similarity and accuracy for each of the social
objects. As can be seen in Table 1, the
correlation between similarity and accuracy for
Social Object I is .01 and for Social Objects
III and IV is .00. The correlation for Social
Object II, however, is .28, significant at the
.01 level of confidence. Since this is the only
one of the four social objects where
significance was achieved, the first hypothesis, that
there is a relationship between accuracy of
prediction and similarity of judge and social
object, was not substantiated.

The following results are concerned with
the relationship of mental health and the
ability to predict personality characteristics
of others. As can be seen in Table 1 there is
no significant relationship between the ability
to predict personality characteristics of others
and self-acceptance (self-ideal-self
correlation) or soundness of the judge, and,
therefore, both Hypotheses II and III have not
been substantiated.

Although it has been shown that similarity
of personality characteristics of judge and
social object is not related to accuracy of
prediction on the part of the judge, other
factors might prevent a simple expression of
this relationship. For example, a judge who
is similar to a social object may predict
accurately only if he is self-accepting (i.e., his
self-ideal-self correlation is high). With this
in mind, a triple classification analysis of
variance for judging ability was carried out.
The judges were divided into a high and low
similarity group as well as a high and low
self-acceptance group (self-ideal-self
correlation), for each of the four social objects.
Judges above the median were considered
high and those below the median were
considered low on both similarity and
self-acceptance.

Because of the varying number of
observations in the cells of the triple classification,
it was necessary to pool the residual and
triple interaction sums of squares. This is a
conservative estimate of the error variance;
if the triple interaction is significant, pooling
it with the error variance would only produce
an overestimate of this term and tend to
produce smaller F ratios.


TABLE 2

ANALYSIS OF VARIANCE OF JUDGING ABILITY, ON THE BASIS
OF SIMILARITY AND SELF-ACCEPTANCE, FOR THE
FOUR SOCIAL OBJECTS

Source of Variation          SS      df       MS
Similarity (Sim)            656       1       656
Self-Acceptance (SA)        167       1       167
Social Objects (SO)      92,219       3    30,740*
Sim X SO                    957       3       319
SA X SO                     761       3       254
Sim X SA                  3,693       1     3,693*
Sim X SA X SO
  and Residual           80,836     387       209
Total                   179,289     399

* p < .01.
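The F ratios implied by Table 2 can be recovered by dividing each mean square by the pooled error term (the combined triple interaction and residual). The sketch below is only a worked check on the printed mean squares; because those are rounded, the ratios may differ slightly from the values in the original F column, and the variable names are illustrative.

```python
# Mean squares as printed in Table 2.
mean_squares = {
    "Similarity (Sim)": 656,
    "Self-Acceptance (SA)": 167,
    "Social Objects (SO)": 30740,
    "Sim X SO": 319,
    "SA X SO": 254,
    "Sim X SA": 3693,
}
# Pooled "Sim X SA X SO and Residual" mean square (80,836 / 387, rounded).
POOLED_ERROR_MS = 209

def f_ratios(effects, error_ms):
    """Return the F ratio of every effect against a common error mean square."""
    return {name: ms / error_ms for name, ms in effects.items()}

ratios = f_ratios(mean_squares, POOLED_ERROR_MS)
for name, f in ratios.items():
    print(f"{name:22s} F = {f:7.2f}")
```

Only the social objects term and the similarity by self-acceptance interaction yield large ratios, which agrees with the two significant mean squares discussed in the text.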


TABLE 3

ACCURACY AND "FAVORABLE" ADJECTIVES
FOR EACH OF THE SOCIAL OBJECTS

                          Percentage of Gough's
                          "Favorable" Adjectives
Social       Mean         Used by Social Objects
Object     Accuracy       in Their Self-description
I            46.24                 42
II           23.96                 30
III          33.51                 31
IV           64.51                 48


The analysis of variance shown in Table 2
reveals two significant mean squares: social
objects, and the interaction between
similarity and self-acceptance. In attempting to
understand the basis for the large mean
square for social objects, it was noted that
the self-descriptions of the social objects
tended to be quite favorably toned. A measure
of this favorable tone was derived from
Gough's list of "favorable" adjectives. Gough
(see Footnote 3) was able to derive the list
by asking a group of 30 college students to
rate the entire list of adjectives for its
favorability. The highest rated 25% were selected
for the favorability key. One question which
immediately comes to mind is whether there
is a relationship between accuracy of
prediction, on the part of the judge, and
favorableness of self-perception, on the part of the
social object. If one compares the list of
favorable adjectives with the mean accuracy
score for each of the social objects, in Table 3,
one notes a perfect correlation (rho) between
the two variables. Thus, the accuracy of
prediction appears to depend, in part, on
favorableness of self-perception of the social
object.
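The perfect rank correlation reported for Table 3 can be checked directly from the four pairs of values. The sketch below uses the familiar rank-difference formula for rho and assumes no tied values, which holds for these data; the helper names are illustrative.

```python
def ranks(values):
    """Rank from 1 (smallest) upward; assumes there are no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman rank correlation via 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Table 3, Social Objects I through IV.
mean_accuracy = [46.24, 23.96, 33.51, 64.51]
favorable_pct = [42, 30, 31, 48]

rho = spearman_rho(mean_accuracy, favorable_pct)
print(rho)
```

With only four social objects the two rankings coincide exactly, so rho equals 1; as the Discussion notes, so small a number of objects limits how far this relationship can be pursued.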
The second significant mean square noted
in Table 2 is the interaction between
similarity and self-acceptance. To further clarify
the relationship, the accuracy scores of judges
according to the four possible combinations
of similarity and self-acceptance, under the
[...] conditions, are shown in Table 4.
Two groups can be considered relatively
accurate: judges similar to a social object and
self-accepting, and judges dissimilar to a
social object and not self-accepting. The
other two groups can be considered relatively
inaccurate: judges similar to a social object
and not self-accepting, and judges dissimilar
to a social object and self-accepting.


DISCUSSION AND CONCLUSIONS 


What are the implications of Sullivan’s 
statement that one judges others only in 
terms of one’s self? It obviously creates a 
huge solipsism in which no distinction is 
drawn between one’s self and others in the 
judgment process. The lack of relationship 
exhibited between similarity of judge and 
social object and accuracy of prediction 
(Hypothesis I), in the present experiment, 
suggests that the prediction of personality 
characteristics of others is not a reflection of 
the judge, a simple attribution to another of 
the traits of the judge. 

In addition, no relationship was found, in 
the present study, between the mental health 
of an individual—using (a) a self-rating 
criterion, self-acceptance (self-ideal-self cor- 
relation); and (b) an external criterion, 
judged soundness of an individual by expert 
raters—and his ability to predict personality 
characteristics of others (Hypotheses II and 
III). 

When, however, both similarity and
self-acceptance (self-ideal-self correlation)
interact, there is a significant effect on accuracy.
This result can be understood if we make
some assumptions concerning the
psychological meaning of each of these variables. Let
us assume that a judge who is relatively
self-accepting will tend to judge others as similar
to himself. The unconscious basis for his
predictions might be as follows: "I am a nice
person and so are other people." A judge who
is relatively low on self-acceptance, however,
will tend to judge others as dissimilar to
himself. Here, the unconscious basis for his
predictions might be: "I am not a nice
person, but others are." The mechanisms
described here are similar to those
characterized by Cameron and Magaret (1951) as,
respectively, assimilative and disowning
projection.


TABLE 4

MEAN ACCURACY OF THE INTERACTION ON HIGH AND
LOW SCORES ON BOTH SIMILARITY
AND SELF-ACCEPTANCE

Group                        N      Mean      SD
High Sim and High SA        120     46.18    20.07
High Sim and Low SA          80     39.08    22.40
Low Sim and High SA          80     37.49    21.23
Low Sim and Low SA          120     42.97    21.93

Note.—High = judges above median on Sim or SA; Low =
judges below median.
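The crossover pattern in the Table 4 means can be made explicit by comparing the simple effect of self-acceptance at each level of similarity. This is only an arithmetic illustration of the cell means, not a significance test, and the dictionary keys are illustrative labels.

```python
# Cell means from Table 4 (mean accuracy scores).
cell_means = {
    ("high_sim", "high_sa"): 46.18,
    ("high_sim", "low_sa"): 39.08,
    ("low_sim", "high_sa"): 37.49,
    ("low_sim", "low_sa"): 42.97,
}

def simple_effect_of_sa(means, sim_level):
    """Accuracy of high-SA judges minus low-SA judges at one similarity level."""
    return means[(sim_level, "high_sa")] - means[(sim_level, "low_sa")]

effect_high_sim = simple_effect_of_sa(cell_means, "high_sim")
effect_low_sim = simple_effect_of_sa(cell_means, "low_sim")
interaction_contrast = effect_high_sim - effect_low_sim

print(round(effect_high_sim, 2), round(effect_low_sim, 2),
      round(interaction_contrast, 2))
```

Self-acceptance goes with higher accuracy among similar judges and with lower accuracy among dissimilar judges, so the two simple effects have opposite signs.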

As was described above, there are objective 
measures of the similarities of judge and 
social object. If we focus our attention on 
those judges who are high on self-acceptance, 
we can hypothesize that they would be ac- 
curate in the prediction of personality char- 
acteristics of those social objects similar to 
themselves, and inaccurate in the prediction 
of personality characteristics of those social 
objects dissimilar to themselves. This would 
logically follow since, as postulated above, a 
self-accepting judge would tend to perceive 
others as similar to himself. Those actually 
similar to him would, therefore, be judged 
accurately, while those dissimilar to himself 
would be judged inaccurately. 

In a similar fashion, we would predict that 
judges with low self-acceptance scores would 
accurately perceive those social objects who 
are dissimilar to themselves, while inaccu- 
rately perceiving those social objects similar 
to themselves. This, too, would logically 
follow since, as postulated above, judges with 
low self-acceptance scores would tend to per- 
ceive others as dissimilar to themselves. 
Those social objects actually similar to these 
judges would, therefore, be perceived inaccu- 
rately, while those dissimilar to them would 
be perceived accurately. 
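The reasoning in the two preceding paragraphs can be restated as a toy model. The sketch assumes, purely for illustration, that traits are coded 0/1, that a self-accepting judge predicts his own traits for everyone (assimilative projection), and that a judge low on self-acceptance predicts the opposite of his own traits (disowning projection). None of this coding comes from the study itself.

```python
def predict_traits(judge_traits, self_accepting):
    """Self-accepting judges copy their own traits; others invert them."""
    if self_accepting:
        return list(judge_traits)
    return [1 - t for t in judge_traits]

def accuracy(predicted, actual):
    """Proportion of traits on which the prediction matches the social object."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical trait profiles (illustrative only).
judge = [1, 0, 1, 1, 0, 0]
similar_object = [1, 0, 1, 1, 0, 1]      # differs from the judge on one trait
dissimilar_object = [0, 1, 0, 0, 1, 0]   # differs from the judge on five traits

for self_accepting in (True, False):
    prediction = predict_traits(judge, self_accepting)
    print(self_accepting,
          round(accuracy(prediction, similar_object), 2),
          round(accuracy(prediction, dissimilar_object), 2))
```

Under these assumptions the self-accepting judge is accurate only for the similar object and the non-accepting judge only for the dissimilar one, which is the pattern the post hoc explanation requires.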

The post hoc explanation invoked here
conforms to the results in Table 4. While the
results presented suggest that judging the
personalities of others involves more than an
attribution of one's traits to another, they also
suggest that judges engage in a mechanical
type of prediction based on their self-concept.
If the judges are self-accepting, they judge
others in a similar manner. If the judges are
not self-accepting, they judge others in a
contrary manner. Accuracy, therefore,
depends not only on whether an individual is
self-accepting or not, but also on whether
the judges are similar or dissimilar to the
social object. Bronfenbrenner (1958) has
made a similar point. He utilized the variable
"Favorability toward others" in his analysis,
however, rather than self-acceptance used in
the present study.

Looked at from this point of view, the
ability to predict personality characteristics
of others should not be perceived as a trait
residing within the judge, but, rather, as
based on the personality patterns of both
judge and social object. This point of view
is further substantiated by an analysis of
variance of judging ability carried out by
the author (Spanner, 1955), in which the
interaction between judge and social object
approached significance at the .05 level.

This approach suggests, therefore, that the
question to be asked will no longer exclusively
be "Is he a good judge of personality?" but
also "Which social objects can he judge
well?" Another facet of the same problem
would be an attempt to break down the
evaluation of personality into various areas
and to assess the ability of judges in each
area. This presupposes a representative
design of the sort suggested by Brunswik
(1947), in which there would be not only a
representative sampling of subjects (judges)
but also of objects (social objects). In the
present instance a preliminary attempt was
made to analyze the different skills involved
in judging social objects with favorable as
compared to unfavorable self-perceptions.
The attempt was abandoned, however,
because of the small number of social objects
(four).

Possibly if some of the aforementioned
approaches are handled successfully,
eventually, we will then be in a position to answer
the following question: "How does the ability
of a judge to predict the personality
characteristics of others relate to his own
personality dynamics?" This, of course, is related
to the entire area of ego psychology and the
differential utilization of ego defenses by the
individual in judging the personality
characteristics of others.


SUMMARY


One hundred military officers were
presented with sound movies of stress interviews
with four interviewees (social objects) and
asked to predict the responses of the social
objects on an adjective check-list, as well as
to describe themselves on the same
instrument. From these basic data measures of
accuracy of prediction and similarity of judge
and social object were obtained. Measures
of self-acceptance (self-ideal-self correlation)
and "soundness" of the judges were
also obtained.

Contrary to expectations, accuracy was
found to be unrelated to similarity of judge
and social object, to self-acceptance, and to
the soundness of the judge. Accuracy
was found to be related, however, to a
combination of the similarity and self-acceptance
variables. The concepts of disowning and
assimilative projection were utilized as a post hoc
explanation of accuracy.


The frustration-aggression hypothesis has
frequently provided a theoretical framework 
for explaining the phenomena of crime, delin- 
quency (Dollard, Doob, Miller, Mowrer, 
Sears, Ford, Hovland, & Sollenberger, 1939), 
and prejudice (Allport, 1954; Dollard et al., 
1939). Two prevalent and pervasive sources 
of frustration which are seen as motivating 
anti-social behavior are low socioeconomic 
status and minority group membership 
(Dollard et al., 1939). Evidence to support 
this position may be adduced from the work 
of Glueck and Glueck (1950) and Suther- 
land and Cressy (1955), who show clearly 
the relationship between poverty and delin- 
quency. On the other hand, Hammer (1953), 
McCary (1950), and Mussen (1953) have 
demonstrated that minority group members 
reveal more manifest aggression than do 
members of majority groups. The factors of 
poverty and discrimination have been con- 
founded in these studies, thus failing to 
clarify the role of each. It may be hypothe- 
sized, however, that minority group member- 
ship and low socioeconomic status combine 
to effect a greater manifestation of hostility
than would simply result from the
frustrations of poverty alone. Delinquents who are
members of a minority might, therefore, be
expected to reveal more evidences of hostility
than delinquents who are majority group
members. The present investigation was
designed to assess this position, utilizing
Spanish-American and non-Spanish, white
delinquents. That the Spanish-American
group is a negatively valued minority has
been pointed out by Jones (1948) and
Saunders (1954) in their writings concerning
the Spanish- or Mexican-American, and his
status among Anglo-Americans. Specifically,
the hypotheses tested were:


1. A group of Spanish-American
delinquents of low socioeconomic status will reveal
significantly greater evidences of aggression,
in both amount and kind, than will be
demonstrated by a similar group of non-Spanish
white delinquents.

2. Because of tendencies on the part of
delinquents to give a "good impression" on
personality measures (Lindzey & Goldwyn,
1954; Vane, 1954), it will be necessary to
correct hostility measures for social
desirability.

3. Social desirability will relate negatively
to manifest and extrapunitive measures of
hostility and positively to intropunitive and
impunitive measures of hostility.

METHOD

Subjects. The original group studied consisted of
81 male and female delinquents on probation at
the Denver Juvenile Court. The subjects were
selected in such a way as to eliminate all who were
the result of mixed group backgrounds, who had
been diagnosed as neurotic or psychotic, were other
than lower class, or who had been institutionalized
in the last 2 years. On the basis of obtaining scores
of 10 or more on the MMPI Lie scale, 7 subjects
were eliminated. The final group thus consisted of
25 Spanish-American males, 12 Spanish-American
females, 25 non-Spanish white males, and 12
non-Spanish white females. All subjects were between
14 and 17 years of age.

Tests. The tests administered were the Rosenzweig
Picture-Frustration Study (Rosenzweig, Fleming, &
Clarke, 1947), the Siegel Manifest Hostility scale
(Siegel, 1956), the 39-item Social Desirability scale
extracted from the MMPI by Edwards (1957), and
the MMPI Lie scale (Hathaway & McKinley, 1953).

Procedure. All subjects were tested in groups of
from 5 to 15, in the courtroom of the Denver
Juvenile Court. All subjects were assured that the
testing had nothing to do with their probation and
that the results would not be made available to their
probation counselors. All subjects were cooperative
and answered all questions without difficulty.


` RESULTS AND DISCUSSION 


Table 1 contains the means and standard
deviations of the four groups on the various
tests administered. Pearson product-moment
correlations were computed between the
Social Desirability scale and the other
measures, as seen in Table 2. Hypotheses 2 and 3
find support as the Social Desirability
scale correlated -.655 with the Siegel
Manifest Hostility scale, -.372 with the
extrapunitive scores, .222 with the intropunitive
scores, and .262 with the impunitive scores.
The first two coefficients are significant at
the .01 level of confidence, and the last two
at the .05 level, with 72 degrees of freedom.
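The significance of each total-group coefficient can be examined by converting r to a t statistic with N - 2 = 72 degrees of freedom. The sketch below assumes the conventional test of a Pearson r against zero; the source does not say whether one- or two-tailed criteria were applied, so the resulting values are left to be compared against a table of t.

```python
import math

def t_for_r(r, df):
    """t statistic for testing a Pearson r against zero, with df = N - 2."""
    return r * math.sqrt(df / (1 - r * r))

# Total-group correlations with the Social Desirability scale (N = 74).
correlations = {
    "Manifest Hostility": -0.655,
    "Extrapunitive": -0.372,
    "Intropunitive": 0.222,
    "Impunitive": 0.262,
}

t_values = {name: t_for_r(r, 72) for name, r in correlations.items()}
for name, t in t_values.items():
    print(f"{name:20s} t(72) = {t:6.2f}")
```

The two hostility-related coefficients give much larger absolute t values than the intropunitive and impunitive coefficients, in line with the different significance levels quoted in the text.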
One-way simple classification analyses of
variance and covariance were computed
between the four groups: Spanish-American
males and females, and non-Spanish white
males and females. This analysis was more
conservative than a group X sex design
would have been. The latter was computed
in the analysis of variance for a 2 X 2
disproportionate subclass numbers design as
recommended by Snedecor (1956), in order
to check for further significance than might
have been obtained via the single variable
design. No additional significance was
obtained for the group, sex, or interaction terms.
Because of the problem of interpreting an
analysis of covariance for the 2 X 2 design
with unequal and disproportionate subclass
numbers, it was decided to employ the more
conservative analysis reported here.

TABLE 1

MEANS AND STANDARD DEVIATIONS OF AGE, SOCIAL
DESIRABILITY, AND MANIFEST HOSTILITY AND OF
EXTRAPUNITIVE, INTROPUNITIVE, AND IMPUNITIVE
SCORES FOR ALL GROUPS

                               Group
             SA-M      SA-F      NSW-M     NSW-F     Total
Variable   (N = 25)  (N = 12)  (N = 25)  (N = 12)  (N = 74)
Age
  Mean        …         …        15.0      14.5      14.8
  SD          …         …         .80       .76       .76
SD
  Mean        …         …         …         …         …
  SD          …         …        6.18      3.50      5.62
MH
  Mean        …         …        19.6      19.3      21.1
  SD         6.62       …        8.01      9.14      7.71
E
  Mean       8.5       9.2       8.2       9.0       8.6
  SD          …        3.96      3.58      3.20      3.42
I
  Mean       7.1       6.4       6.5       6.6       6.7
  SD         2.33       …         …        2.59      2.40
M
  Mean       7.1        …         …         …         …
  SD         2.23       …        2.77      1.58      2.58

Note.—SD refers to Social Desirability scale; MH, Manifest
Hostility scale; E, Extrapunitive Scores; I, Intropunitive
Scores; M, Impunitive Scores. SA-M refers to Spanish-American
males; SA-F, Spanish-American females; NSW-M, non-Spanish
white males; NSW-F, non-Spanish white females. Cells shown
as … are illegible in the source.


TABLE 2

CORRELATIONS BETWEEN THE SOCIAL DESIRABILITY SCALE AND
THE MANIFEST HOSTILITY SCALE, AND THE EXTRAPUNITIVE,
INTROPUNITIVE, AND IMPUNITIVE SCORES ON THE ROSENZWEIG
PICTURE-FRUSTRATION STUDY

                               Group
             SA-M      SA-F      NSW-M     NSW-F     Total a
Variable   (N = 25)  (N = 12)  (N = 25)  (N = 12)  (N = 74)
SD and MH   -.648**   -.627*    -.706**     …       -.655**
SD and E    -.110     -.581     -.466       …       -.372**
SD and I     .136      .421      .288       …        .222*
SD and M     .104      .588*      …         …        .262*

a The r's for the total group are not based on an averaging
of the subgroup coefficients but on direct computation from
the basic data. All subgroup coefficients in the first three
rows are homogeneous. Those in Row 4 are heterogeneous and
the total r of .262 is at best a conservative estimate of
this relationship.
* Significant at .05 level.
** Significant at .01 level.


TABLE 3

F RATIOS BETWEEN GROUPS FOR ORIGINAL MEANS
AND MEANS ADJUSTED FOR SOCIAL DESIRABILITY

              Original        Adjusted
               Means           Means
Variable    (df = 3, 70)    (df = 3, 69)
SD             3.327*            -
MH             1.011           3.240*
E               <1              <1
I               <1              <1
M               <1              <1

* Significant at .05 level.


The analyses of variance across groups
were significant at the .05 level for social
desirability (see Table 3), and not significant
for the Manifest Hostility scale (see Table
3). No significance was obtained for the
analyses of extrapunitiveness,
intropunitiveness, and impunitiveness (see Table 3).
Because of the significant differences among the
groups in social desirability and the
correlations noted above, analyses of covariance
were computed among the groups on the
hostility scores, while adjusting the means for
social desirability. Significance was attained
for the Manifest Hostility scores, but not for
any of the other measures (see Table 3).
The adjusted means computed for the
Manifest Hostility scale are: Spanish-American
males, 24.3; Spanish-American females, 19.8;
non-Spanish white males, 19.3; and
non-Spanish white females, 19.7.

It is apparent that Hypothesis I is partially
supported since the Spanish-American male
group obviously manifested more hostility on
the Manifest Hostility scale than did any of
the other groups.
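The adjustment of group means for social desirability can be sketched with the standard covariance formula, in which each group's outcome mean is corrected by the pooled within-group regression slope times the group's deviation from the grand covariate mean. The scores below are invented for illustration and are not the study's data; the function name is likewise hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)

def adjusted_means(groups):
    """groups maps a label to (covariate_scores, outcome_scores).

    Returns the covariance-adjusted outcome means and the pooled
    within-group slope of the outcome on the covariate.
    """
    all_x = [x for xs, _ in groups.values() for x in xs]
    grand_x = mean(all_x)
    sxy = sxx = 0.0
    for xs, ys in groups.values():
        mx, my = mean(xs), mean(ys)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx  # pooled within-group regression coefficient
    adjusted = {label: mean(ys) - slope * (mean(xs) - grand_x)
                for label, (xs, ys) in groups.items()}
    return adjusted, slope

# Invented scores: covariate = social desirability, outcome = hostility.
toy_groups = {
    "group_a": ([20, 22, 24], [22, 20, 18]),
    "group_b": ([26, 28, 30], [22, 20, 18]),
}
adjusted, slope = adjusted_means(toy_groups)
print(slope, adjusted)
```

In this toy case the raw hostility means are identical, yet the adjusted means differ once the groups' unequal covariate levels are taken into account. This mirrors the way a group difference on the Manifest Hostility scale emerged only after adjustment.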

It is not clear why the Spanish-American
females did not reveal more evidences of
hostility than did either of the non-Spanish white
groups. It is noteworthy that the significance
obtained among the groups on the Social
Desirability scale is probably, in part, due
to the fact that the Spanish-American females
received the lowest mean score on this
variable.

Jones (1948) in a study of the ethnic
patterns of the Mexican family in the United
States indicates that the home training of
girls in this group is radically different from
that given boys. Among other things
manifestations of hostility are strongly
disapproved. The boys of this group might thus
be more similar to the non-Spanish white
group in background, and it is possible that
the Social Desirability scale is thus less
appropriate for the Spanish-American female
than for the male. This somewhat different
conception of social desirability plus
pressures against evidencing overt aggression
would account partially for the low Social
Desirability scale scores obtained by these
girls, and failure of the adjusted Manifest
Hostility scores to increase along with those
of the Spanish-American males.

The failure of the Rosenzweig
Picture-Frustration Study to yield significance has
been noted before (Lindzey & Goldwyn, 1954;
Vane, 1954), and it may be that the P-F
study is actually not measuring manifest
hostility but rather aggression on a different level
of the personality than that which might be
significant for the present study. The P-F
study does, however, seem sensitive to the
variable of social desirability, as is the Siegel
Manifest Hostility scale, and it is suggested
that future research involving these two
instruments take this into consideration, as was
done here.

Another meaningful consideration may be
based on the possibility that the lower-class
culture simply supports the expression of
hostility without it necessarily being related to
frustration. The assumed
frustration-aggression relationship may therefore be
inappropriate. This would account for the
observations on the Manifest Hostility scale and
the failure of the P-F study to reveal the
hypothesized tendencies since the latter measure
assumes the significance of frustration.

Nevertheless, the results of the present
investigation indirectly lend support to the
frustration-aggression hypotheses by suggesting that
the addition of frustration from low
socioeconomic status and minority group
membership may increase the expression of
aggression over that which is observed
when only low socioeconomic conditions are
present.

SUMMARY AND CONCLUSIONS

The present study was designed to
investigate the relation of hostility to a combination
of low socioeconomic status and minority
group membership. The frustration-aggression
hypothesis was employed as a theoretical
framework, and it was hypothesized that the
added frustration of minority group
membership in addition to low socioeconomic
status would produce more manifestations of
hostility than would be observed in majority
group members of similar class levels. It was
further hypothesized that tendencies to give
a "good impression" would relate negatively
to manifest hostility and extrapunitive
tendencies, while relating positively to
intropunitive and impunitive expressions. On
the basis of the latter considerations, it was
expected that obtained hostility scores would have
to be adjusted to minimize such biases.

Eighty-one Spanish-American and
non-Spanish white delinquents on probation
served as subjects. All subjects were
administered the Siegel Manifest Hostility scale, the
Social Desirability scale derived by Edwards,
the Rosenzweig Picture-Frustration Study,
and the Lie scale from the MMPI. Seven
subjects who obtained scores of 10 or more
on the Lie scale were eliminated from the
study. Significant negative correlations were
found between the Social Desirability scale
and the Siegel Manifest Hostility scale, and
the extrapunitive scores from the Rosenzweig
Picture-Frustration Study. Significant
positive correlations were obtained between the
Social Desirability scale and the measures of
intropunitiveness and impunitiveness. Once
the hostility means were adjusted to remove
the effects of social desirability, significance
was obtained between the groups on the
Manifest Hostility scale. The
Spanish-American male group was shown to manifest
significantly greater hostility on this measure
than any other group, thus partially
supporting the main hypothesis.


Most self-report tests of personality obtain
measures of male and female sex role
identification through the use of like-sex
scales. Sex role identification or the lack of it
is typically evaluated by the subject's
responses to a scale of items constructed and
validated on his own sex (Hathaway &
McKinley, 1943; Strong, 1943). Thus, a
person obtaining an average or high score on
his own sex scale is said to be identified with
his own sex, and a person obtaining a low
score on his own sex scale is said to lack
identification with his own sex, or to be more
like the opposite sex in his responses. In an
earlier paper the authors described the
derivation of a scale for boys and girls between
the ages of 8 and 12 years in which each
subject received a score on an opposite-sex
scale as well as on a like-sex scale (Rosenberg
& Sutton-Smith, 1959). The existence of
these techniques for measuring masculinity and
femininity has made it possible to investigate
the relative effectiveness of both the like-sex
scale and the opposite-sex scale for
discriminating sex role identification.

It is generally assumed that individuals
who are faulty or aberrant in their sex role
identification will be more emotionally
disturbed than those who are not (Henry, 1948;
Terman & Miles, 1936). Therefore, they
should be expected to have less satisfactory scores on
measures of emotional stability. This paper
examines the relationship between various
types of scores on a masculinity and a
femininity scale and several independent measures
of emotional stability. The independent
measures of emotional stability make it
possible to explore not only the existence of
sex role confusion, but also certain of its
qualitative aspects.

1 This study was facilitated by a grant from the
Scholarly Advancement Committee, Bowling Green
State University.


METHOD 


A group of 337 children in the fourth, fifth, and 
sixth grades in two elementary schools in Northwest 
Ohio and Southeast Michigan comprised the sample.2
Several instruments were administered to the entire 
group: (a) a check list of games and play activities 
which is described elsewhere (Rosenberg & Sutton- 
Smith, 1959) and which effectively yields measures 
of masculinity and femininity, (b) an empirically
derived scale which has been found to measure ex- 
tremes of impulsive behavior (Sutton-Smith & Rosen- 
berg, 1959), (c) the Children’s Manifest Anxiety 
scale which affords a measure of anxiety (Castaneda, 
McCandless, & Palermo, 1956), and finally (d) the 
Brown Personality Inventory (Brown, 1934). The 
latter scale, derived in 1934, has been found to dis- 
criminate neuroticism in children, and to be highly 
correlated with the CMA scale (Rosenberg, Sutton- 
Smith, & Morgan, in press). Its addition provides 
subscores which reveal the source of major concern 
for neurotic children (home, school, physical, inse- 
curity, and irritability). The “home” subscale in- 
cludes feelings of parental rejection, parental severity, 
and sibling jealousy. The “school” subscale identifies 
feelings of inadequacy in the classroom. The “physi- 
cal” subscale deals with disturbances in physical well- 
being which appear to be psychogenic in origin. The 
“insecurity” subscale contains items reflecting anx- 
iety, interpersonal inadequacy, and generalized neu- 
rotic concerns. The “irritability” subscale examines 
emotional reactivity, low tension-binding qualities, 
and severe feelings of rejection (Brown, 1934). 

An analysis was conducted of the scores on im- 
pulsiveness, anxiety, and neuroticism of boys and 
girls in the upper and lower quartiles on the 
masculinity and femininity scales. 

2 The authors wish to express their indebtedness
to H. Lehtomaa, R. Knestrict, and the teachers of
Dundee, Michigan, schools for their assistance in the
collection of the data.


TABLE 1

NEUROTIC INDICES OF BOYS AND GIRLS HIGH AND LOW ON THE MASCULINITY SCALE

Group            Imp       Anx      BPI     Ho     Sch     Phy    Inf    Irr
High Boys   M    9.71*    16.83    22.36   3.18   1.40*   6.13   5.09   3.36
            SD   3.61      7.98    13.63   2.77   1.19    5.05   3.91   1.98
            N     …       48       45      45     45      45     45     45
Low Boys    M    8.26     17.48    24.81   3.76   1.88    6.88   5.53    …
            SD    …        8.21    12.56   2.33   1.55    5.28   3.38   2.0
            N     …       46       43      42     42      42     43     43
High Girls  M    8.00***  21.09*   25.17    …     1.52    6.43   6.36   3.60
            SD   4.08      7.86    13.09   3.07   1.35    4.24   3.56   1.05
            N    44       45       42      42     42      42     42     42
Low Girls   M    5.92     18.56    22.45   3.04   1.43    6.27   5.57    …
            SD   2.66      7.32    10.83   2.41   1.26    4.52   3.34   2.02
            N    52       52       47      47     47      47     47     47

* Significant at the .10 level or less.
** Significant at the .05 level or less.
*** Significant at the .01 level or less.
Cells shown as … are illegible in the source.


RESULTS 


The performances on the neurotic indices 
of boys and girls high and low on masculinity 
are presented in Table 1. The slight variation 
in the size of the Ns is due to incomplete 
protocol on various tests. The results of 
Table 1 suggest that scoring high or low on 
the masculinity scale is reflected very little 
in independent measures of neurotic behavior 
for boys. The general tendency is for boys 
low on masculinity to appear more anxious 


and neurotic, but the differences ahi 
them and the high scorers fail to achia 
statistical significance. In the case of E 
girls, the results are more substantial. Gi ‘ne 
high on masculinity tend to be more oe 
pulsive and anxious than girls low on mas¢ ir 
linity, and show more neuroticism about the 
home life (parental rejection, etc.). -Ati 
The performances on the neurotic ara 
of boys and girls high and low on feminin? 
are presented in Table 2. The results indic? 


TABLE 2 
Neurotic INDICES or Boys Anp GIRLS Hic AND Low on tHe FEMININITY SCALE p 
Group Imp Anx BPI Ho Sch Phy Inf Irr 
High M 10.33 18.26 2600 387 am zos gage 36l 
ma Soo asr 7.86 13.04 302 124 468 3.64 1.75 
Ys yN g 43 38 38 38 38 38 38 
Low M 8.52 14.31 20.73 3.85 1525.97 4.41 2.85 
Boys SD «3.00 6.67 9.99 2.29 1.52 387 2.68 1.91 
ws y g 42 41 40 40 40 41 41 
: M 7.49 19.86 25.25 4.20 180 6.68 4.80 3p? 
mg Spo 3e 7.13 13.41 3.19 154 458 6.16 2.10 
ir vN n 43 40 40 40 40 40 40 
i M 7.00 20.73 25.77 414 13 670 4.52 3.70 
Gis So a0 679 1124 266 119 439 5783 1.89 
N 48 48 44 44 44 44 da “4 A 


* Significant at the .10 level, or less. 
** Significant at the .05 level, or less. 
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TABLE 3 


Neurotic Inpices or Boys AND GIRLS HIGH AND Low on THE MASCULINITY SCALE AND 
IN THE MIDDLE RANGE ON THE FEMININITY SCALE 


Group Imp Anx BPI Ho Sch Phy Inf Irr 
High M 9.02 16.90 2430 3.60 1.50 6.15 5.50 3.60 
Boye SD 2.96 831 1136 2.62 1.29 4.52 3.40 1.88 

n N 21 21 20 20 20 20 20 20 
Lisi uM 035 2140 29.33 4.06 2.06 8.3 6.94 4.56 
Boys SD 2.88 9.06 13.45 2.29 147 5.33 3.79 1.99 

y 7 18 18 18 18 18 18 18 
High M 7.53 20.60 2171 5 1.64 5.50 5.07 3.93 
Girls SD 4.25 9,36 7.91 2.10 78 3.31 1.76 1.87 

N 15 14 14 4 14 14 14 14 
tay M 5.82 17.78 21.92 3.13 1.17 6.38 5.43 3.38 
Gik SD 27 80S 1146 2777 118 458 39 22 

N 27 27 24 24 24 24 24 24 

that boys who are high on femininity are
more impulsive, anxious, and neurotic than
boys who are low on femininity, and that
their neurotic concerns focus upon physical
problems, feelings of insecurity, and irritability.
From Table 2 it can be seen that girls
high and low on femininity do not differ significantly
in their performances on the
neurotic indices.

In order to examine the possibility that
it is not the variation in masculine or
feminine identification but response-set which
accounts for some of the differences obtained,
a further analysis was undertaken.
This analysis was similar to that described
above, except that in order to decrease the
possibility of response-set being of importance,
boys and girls were chosen who were
high or low on each scale, but were within
the middle range on the other scale (i.e., their
scores on the opposite-sex scale were between
the 25th and 75th percentiles). Thus, one of the
groups of boys and girls was of those who had
high and low masculinity scores (upper and
lower quartiles), but were within the middle
range in their femininity scores. Table 3 presents
the results on the neurotic indices of boys and
girls high and low on masculinity and middle
range on femininity. Though none of the
differences achieve statistical significance, boys
low on masculinity and middle on femininity
tend to be more anxious and neurotic than


boys high on masculinity and middle on femi- 
ninity, with the focus of concerns on the 
physical and inferiority subscales of the 
Brown Personality Inventory. For the girls, 
those high on masculinity and middle on 
femininity tend to be more impulsive and
anxious, with concerns about physical
symptoms.

Table 4 presents the results on the neurotic 
indices of boys and girls high and low on 
femininity and middle range on masculinity. 
There is only one instance in which signifi- 
cant differences are found, but boys high on 
femininity and middle range on masculinity 
tend to be more anxious and neurotic, with 
the focus of concerns about physical, inferi- 
ority, and irritability feelings. For girls, those 
low on femininity and middle on masculinity
tend to score higher on measures of anxiety 
and neuroticism, with concerns surrounding 
the home. Apparently, both scales are of some 
use in the evaluation of sex role identification, 
though this confusion is reflected most clearly 
by the opposite sex scale. It is noteworthy 
that though they do not achieve statistical 
significance (possibly because of the de- 
creased size of the N), a number of the 
differences between the highs and lows in this 
latter analysis are of some magnitude, and
are in the same direction as the scores when
response-set is not controlled.


DISCUSSION 


The results of the present study show that 
the opposite-sex scale has a higher relation- 
ship to various measures of emotional sta- 
bility than does the like-sex scale. This 
suggests that the opposite-sex scales are more 
effective in the discrimination of faulty sex 
role identification. The like-sex scales, though 
discriminating between high and low scorers 
of the same sex, were not equally effective 
in detecting faultiness of sex role identifica- 
tion as reflected in these independent meas- 
ures of emotional stability. Apparently it 
would be most economical to use the opposite- 
sex scale in diagnosing sex role identification. 

The measures of masculinity and femi- 
ninity used in this study required children 
to indicate whether they played or did not 
play certain games. In addition, children were 
required to indicate whether they liked or 
disliked the games that they played. The fact 
that the responses of both boys and girls to 
play and game items of their own sex were 
not significantly related to measures of emo- 
tional stability suggests that the role expecta- 
tions for each sex are fairly explicit in this 
area of like-sex behavior. Apparently, boys 
are clear about what boys are supposed to 
do as boys, and girls are clear about what 
girls are supposed to do as girls, so that 
irrespective of sex role confusion, both sexes 
may make enough conventionally expected 
responses on the same sex scale for it not
to be useful as an indicator of sex role 
confusion. On the other hand, the greater 
effectiveness of the opposite-sex scale suggests 
that role expectations of this sort may be 
more ambiguous. That is to say, children are
less certain about how much interest they 
should show in the activities of the opposite 
sex. 

Examination of the qualitative differences 
between boys and girls who are high scorers 
on the opposite-sex scale and the independent 
measures of emotional stability shows that 
there are important sex differences in the 
type of sex role confusion yielded by this 
scale. For example, the boys who are high 
scorers on the femininity scale appear to 
show considerably more confusion than the
girls who are high scorers on the masculinity
scale. In previous papers, the authors have
shown that femininity in boys is associated
with high scores both on anxiety and impulsiveness
(Sutton-Smith & Rosenberg, in
press-a, in press-b). Those findings are confirmed
by the present study. In addition,
the fact that the anxiety is expressed mainly
on the subscales insecurity, irritability, and
physical problems suggests that these boys
have a very uncertain “self” picture. On the
one hand they are anxious about their personal
and physical selves; on the other, they
are prepared to indulge in impulsive acting


TABLE 4 


NEUROTIC INDICES OF BOYS AND GIRLS HIGH AND LOW ON THE FEMININITY SCALE AND
IN THE MIDDLE RANGE ON THE MASCULINITY SCALE


i — we a 
Group Imp Anx BPI Ho Sch Phy Inf ie = 
: M 10.13  18.27** 2481 3.44 5 5 3.25 
Hish 1 1.75 7.00 5.25 
Bos SD 3.11 7.40 13.96 3.14 1.44 4.70 2.86 1.39 
N 15 15 16 16 16 16 16 16 
' M 9.50 13.27 20.52 4.19 2.57 
oğ 1 3 ; 1.29 5.19 4.19 aa 
Boys SD * 342 710 ool 272 116 328 2s 18 
N 22 22 21 21 2 21 21 21 
‘ M 7.20 19.67 25.00 3.62 1.74 67 79 3.58 
a SD 1.91 6.70 12.38 3.22 1.58 3.66 387 2.05 
N 20 21 19 19 19 19 19 19 
’ M 7.00 21.24 27.19 4.90 1.67 5 3.67 
a SD 898 Tio 1048 2.36 3 e 
N 21 21 21 21 21 2 2 21 


** Significant at the .05 level, or less. 


out behavior (“I like to throw snowballs,” 
“I like to chase fire engines”); and in addi- 
tion, they show a greater than normal prefer- 
ence for feminine play activities and games. 
While this empirically derived picture is a 
paradoxical one insofar as impulsiveness and 
femininity would appear superficially to be 
contraries, it is similar to a finding of Hartley 
(1959) based on interview materials that 
there are some boys who choose such a 
feminine-impulsiveness as a defense. There is 
some agreement then that at least one important
type of sex role confusion in children
of this age group is to be found manifest in 
this feminine-impulsive syndrome. While the 
dynamics lying behind this syndrome are not 
at present apparent, the authors favor the 
view that impulsiveness is a means of warding 
off anxiety about sex role deviancy (abnormal 
identification with feminine role preferences) 
through pseudomasculine acting out behavior 
(Sutton-Smith & Rosenberg, in press-a). 
The highly masculine girls do not show 
quite the same inconsistencies in their various 
scores. Their high scores on the impulsiveness
scale (on which males have higher average
scores than females) are consistent with their
high scores on the masculinity scale. Not only 
are the girls’ choice patterns consistent, they 
are also more culturally acceptable than the 
choice patterns of boys scoring high on femininity.
It is not unusual for girls to choose
masculine type activities in Western culture
(Brown, 1958); in fact it is increasingly suitable
for them to do so (Rosenberg & Sutton-Smith,
1960; Sutton-Smith & Rosenberg, in
press-c). Nevertheless, the presence of high
anxiety scores in these girls suggests that
their masculine identifications do involve
some conflict. Apparently, it is perceived as
a conflict between themselves and their parents
(higher scores on the home subscale),
who are seen as severe and rejecting, and not
as a conflict in their own self picture. Possibly,
their masculine assertiveness may be a
form of defense against parental rejection.


SUMMARY 


The present study sought to examine the
relative effectiveness of like-sex and opposite-sex
scales in discriminating sex role identification.
Each subject received scores on like as
well as opposite-sex scales, and three independent
measures of emotional stability. Apparently,
children scoring high or low on a
like-sex scale do not differ significantly on 
measures of emotional stability. Children 
scoring high on opposite-sex scales tend to be 
more anxious, impulsive, and neurotic than 
children low on opposite-sex scales. Explana- 
tions of the differing symptoms of such sex 
deviant boys and girls are offered. From the 
results of the present study, there is some 
doubt that like-sex scales, as they are tradi- 
tionally used, are as effective as heretofore 
suspected in discriminating sex role identification.
There has been much interest in the effects
of anxiety on performance in recent years. In 
a series of studies Taylor, Spence, and others 
at Iowa showed that subjects (Ss) with a 
high anxiety level performed better on a 
variety of simple tasks than Ss low in anxiety 
(Spence & Farber, 1953; Spence & Taylor, 
1951; Taylor, 1951; Wenar, 1954). With 
more complex tasks, anxiety level tended to 
have the opposite effect, low anxiety Ss per- 
forming better (Farber & Spence, 1953; 
Maltzman, Fox, & Morrisett, 1953; Monta- 
gue, 1953; Ramond, 1953; Taylor & Recht- 
shaffen, 1959; Taylor & Spence, 1952). These 
results were explained in terms of Hull’s 
theory of learning (Taylor, 1951; Taylor & 
Rechtschaffen, 1959; Taylor & Spence, 1952). 
Child (1954) criticized this theoretical ap- 
proach for concentrating on the character- 
istics (simplicity or complexity) of the task 
while ignoring the category of responses Ss 
learn to make to the cues provided by their 
own anxiety. 

Sarason and his colleagues have presented 
a more complex theory for explaining the role
of anxiety in performance. Their theory pre- 
dicts different effects from small and large 
amounts of anxiety, and takes into account 
the S’s response to his own anxiety (Mandler 
& Sarason, 1952; Sarason, Mandler, & Craig- 
hill, 1952; Wiener, 1959). They assume that 
two kinds of anxiety responses may be aroused 
by a testing situation: those which are self- 
centered and ego-defensive, and those which 
are evoked and learned in the course of task 
performance and are directed toward task 
completion. 

Small amounts of anxiety are held to im- 
prove performance by increasing the S’s mo- 
tivation and strengthening his task relevant 
responses. Larger amounts of anxiety lead to 
or strengthen the ego-defensive (Ra) responses.
These responses are characteristic of
the S rather than specific to the task, always 
available in the response repertoire and read- 
ily evoked. Since Ra responses are self-cen- 
tered rather than task relevant they interfere 
with performance. Individuals with a high 
anxiety drive will have many Ra responses in
their response repertoire. In contrast, in relation
to the total number of responses available,
low anxiety Ss will tend to make more
task relevant anxiety responses. Results obtained
with Yale students as Ss and using a
variety of tasks (e.g., Kohs blocks, maze
learning, digit symbol, motor learning, projec- 
tive material) generally support the theory 
(Mandler & Sarason, 1952; Mandler & Sara- 
son, 1953; Sarason & Mandler, 1952; Sara- 
son et al., 1952; Wiener, 1959). 

Sarason and Gordon (1953) also developed 
a measure of anxiety specific to the test stress 
situation to be studied, implicitly recognizing 
that anxiety need not function as a unitary 
drive, but might be a potential reaction of the 
individual which is only elicited under certain 
conditions. Sperber (1959) has reported correlations
between the scores obtained by Air
Force recruits on Sarason’s Test Anxiety 
Questionnaire and their scores on the Taylor 
Manifest Anxiety scale (1953) and Winne’s 
Neuroticism scale (1951). Winne’s interpre- 
tation of his results indicates that his scale 
can be considered a general measure of mani- 
fest anxiety (pp. 120-121). The highest cor- 
relation between the test anxiety and general 
anxiety measures was +.35, suggesting that 
a general anxiety measure might not be an 
adequate gauge of an S’s proneness to anxiety 
in a test situation. Since the interest of the 
present research was in studying how anxiety, 
when evoked, operates to affect test performance,
Sarason’s measure of anxiety proneness
specific to testing situations was used. 

The purpose of the present study was to 
determine whether there is any difference in 
performance between: (a) high test anxiety 
Ss under high vs. low stress, (b) low test 
anxiety Ss under high vs. low stress, (c) high 
test anxiety Ss vs. low test anxiety Ss under
high stress, and (d) high test anxiety Ss vs.
low test anxiety Ss under low stress.

A further purpose was to bring additional 

evidence to bear on the different theoretical 
approaches to the problem of anxiety and 
performance presented by Taylor and Spence 
(1952) and by Sarason et al. (1952). 


METHOD 


Stress Environment 

A situation is stressful by virtue of its ability to
induce anxiety in individuals subjected to the situation.
Lazarus, Deese, and Osler (1952) state:
“The principal problem in the study of performance
under stress has been the production of [illegible]
situations” (p. 295). This experiment took place in
association with a naturally stressful testing situation.
The experimental data were collected at Sampson
Air Force Base in December 1953, with
recruits as Ss. Prior to testing, an officer lectured
them on the importance of the aptitude testing program
they were to undergo. Many of them
hoped to benefit from the opportunity for some
specialized vocational training which would [illegible]
getting [illegible] assignment for the [illegible].
The officer told them, in essence, that the realization
of their hopes would be determined by the results of
the aptitude testing.

The recruits were then tested on two successive
days. The first day was devoted to the regular aptitude
testing program. The second day was scheduled
for HRRC experimental testing, but the recruits 
were only told to report to the same building for 
more testing. The use of identical administration 
procedures by the HRRC and aptitude testing staffs 
reinforced the Ss’ tendency to view both days’ testing
as serving the same purpose.

High stress. Ss were allowed to continue believing 
that the experimental tests, presented as the Abilities 
Test Battery and administered by uniformed military 
personnel,² were part of the official assessment program.
The battery consisted of a large number of
timed subtests. The instructions implied that the Ss 
were expected to finish all of the items in each sub- 
test, incompleted items counting as failures. The time 
limits were so chosen that the Ss could not possibly 
complete all of the items. The very first “test” was 
an interesting puzzle chosen to challenge and involve 
the S, but not solvable in the time allotted (Cowen, 
1952, p. 514). It was included solely for the purpose 
of starting the Ss with a failure experience. 

After completing the battery, a 40-minute inter- 
lude was devoted to procedures presented as not 
related to the “assessment program,” and adminis- 
tered by a civilian. Then the sergeants announced: 
“We're going to repeat two of the tests you took this 
morning. You’ve had practice on them and you 
should improve. Let’s see how much better you do 
this time.” The first test was the puzzle, which again 
provided a failure experience. Ss then repeated the 
Letter Substitution Test. 

Low stress. The instructions were designed to 
make it clear that the Abilities Test Battery was 
experimental, and not part of the regular assessment 
program, but also to indicate the importance of the 
tests to the Air Force and to maintain a high level of 
task oriented motivation for good performance. For 
these Ss the first test was a simple line tracing task 
which started them with a success experience. The 
rest of the battery was the same as for the high 
stress group. The low stress Ss were interrupted the 
same number of times and in the same places, but 
the instructions allowed them to treat unanswered 
items as evidence that the test required changes, 
rather than as signs of personal inadequacy. 

After the 40-minute interlude, instead of receiving 
instructions which implied that an improvement in 
performance was expected, the low stress Ss were 
told: “We're going to repeat two of the tests you 
took this morning. You’ve had practice on them
which should help, but you’re somewhat tired by
now—so let’s see what happens. Do the best you
can.”


To help underscore the dissociation of the tests 
from the regular assessment program, the test ad- 
ministrator wore civilian clothes. To emphasize that 
the Air Force really had an interest in the procedure, 
the proctor, who was in a less active role, remained 
in uniform. 


2 Stubblebine administered the performance tests 
under both high and low stress conditions, and Swen- 
son proctored all of the testing. 
Performance Battery³


A series of paper and pencil tests, suitable for 
group administration, and tapping a range of cogni- 
tive functions was used. With the exception of the 
atmosphere reinforcing puzzle or tracing task, high 
and low stress Ss worked on the same tasks, which 
are described in the order they appeared in the 
battery. 

Letter series. An expanded version of Thurstone’s
(1943) PMA Reasoning subtest, containing the original
30 items and 20 new ones constructed by the
writer, was administered in two consecutive parts. 
For each part, containing a random selection of 15 
original and 10 new items, 4 minutes were allowed, 
the parts being scored separately. 

Letter substitution. The items were taken from 
Lazarus and Eriksen’s (1952) expanded version of 
the Form I Digit Symbol test. The Ss were allowed 
3½ minutes for 200 items. This test was repeated in
the post interlude battery. 

Revised Minnesota Paper Form Board. The series 
MB version of this test (Likert & Quasha, 1948), 
which measures the S’s ability to perceive spatial 
relations, was administered in the usual manner. 

Number matching. This test is similar to the num- 
ber comparison sections of the Minnesota Vocational 
Test for Clerical Workers (Andrew, Patterson, & 
Longstaff, 1933), but new items were constructed by 
the writer. It measures the S’s ability to concentrate 
and attend to detail, the items requiring the S to 
make rapid discriminations of small differences. Each 
page contained 100 items and was timed and scored 
separately. There were three consecutive parts to the 
test, with 43 minutes allowed per page. 


Test Reaction Questionnaire (TRQ)


After the performance testing, the experimental 
purpose of the tests was explained to the Ss. Then a 
22-item questionnaire, constructed by the writer and 
similar in format to the Test Anxiety Questionnaire 
(Sarason & Mandler, 1952), was administered to 
ascertain the S’s reactions to various aspects of the 
testing situation. The S put a mark on a 16-centi- 
meter line to indicate the strength of his reaction, the 
ends of the line representing polar responses to the 
question. Each question was scored by measuring the 
distance in centimeters from the end of the line to 
the S’s mark. 
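
The line-marking score described above amounts to a simple distance measurement, sketched here for concreteness. This is a minimal illustration and not the writer's scoring procedure: the function name `score_trq_item`, the sample marks, and the option of measuring from either end are assumptions, since the text does not say which end of the line served as zero.

```python
LINE_LENGTH_CM = 16.0  # length of the response line given in the text


def score_trq_item(mark_cm, from_right_end=False):
    """Return the item score: distance in cm from one end of the line
    to the S's mark. Which end counts as zero is an assumption."""
    if not 0.0 <= mark_cm <= LINE_LENGTH_CM:
        raise ValueError("mark must lie on the 16-cm line")
    return LINE_LENGTH_CM - mark_cm if from_right_end else mark_cm


# Hypothetical marks (cm from the left end) on three questions.
marks = [4.0, 12.5, 16.0]
scores = [score_trq_item(m) for m in marks]
reversed_scores = [score_trq_item(m, from_right_end=True) for m in marks]
print(scores, reversed_scores)
```

Measuring from the opposite end simply reverses the direction of the scale, so the choice matters only for keeping all 22 items keyed consistently.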


Personality measures 


A slightly modified version of Sarason’s Test Anxiety
Questionnaire (TAQ)⁵ was administered to the


3 I wish to thank Robert H. Bauernfeind, Editor
of the Test Department, Science Research Associates, 
for permission to use the Letter Series items from the 
PMA (Thurstone, 1943), and Richard S. Lazarus for 
making available the extended version of the Digit 
Symbol test. 

4 Copies of the TRQ may be obtained from the 
writer. 

5 I wish to thank Seymour B. Sarason for making
available a copy of his scale, and granting permission 
Ss after the TRQ. Following Sarason and Gordon’s
(1953) suggestion, local norms, based on the re- 
sponses of both high and low stress Ss, were used in 
scoring each question. The resulting distribution of 
scores was very similar to that reported for Yale 
students (Sperber, 1959). 


Other Procedures 


The Minnesota Multiphasic Personality Inventory 
(MMPI) was administered in an afternoon session. 
Also available for each S was his stanine score on the 
Technician’s Specialty Index (TSI) (Dailey, Lecznar, 
& Brokaw, 1948), considered to be the military test 
which best measures general intelligence level.⁶ For
most Ss, raw scores on the Armed Forces Qualifica- 
tion Test (AFQT) (Bolanovich, Mundy, Burke, & 
Falk, 1953), another gauge of intelligence level, were 
also available. 


Subjects 


All 399 new recruits who arrived at Sampson AFB 
to begin basic training the week before this experi- 
ment was conducted constituted the original sample 
tested. The Ss were organized in six “flights” with 
about 65 men in each. Three flights were arbitrarily 
assigned to each of the stress conditions. Tests were 
administered to one or two flights at a time, 201 men 
performing under high and 198 under low stress.

Recruits with a TSI in the lowest two stanines, or 
an AFQT score less than 15 were considered too low 
in intelligence to participate, and their data were not 
analyzed. Twenty percent were eliminated by these 
criteria. Ss whose MMPI records did not meet acceptable
standards on the validity checks (Hathaway
& Meehl, 1951) were also dropped, leaving 146 high
stress and 148 low stress Ss. These men supplied the 
data on which were based the norms for scoring the 
TAQ, the correlations between test anxiety and gen- 
eral anxiety, and the analyses of reactions to the 
experimental conditions (TRQ responses).

The performance of only those Ss who scored in 
the highest and lowest quartiles on the TAQ was 
studied. High test anxiety (HTA) Ss had scores of
24 or higher, low test anxiety (LTA) Ss scored 11 
or less. There were 61 Ss under high stress, 32 HTA 
and 29 LTA. Under low stress there were 71 Ss, 33 
HTA and 38 LTA. These four groups were well 
matched with respect to age, years of education, and 
intelligence. For each group the mean age was close 
to 19, mean years of education was about 11, and 
the mean TSI stanine was approximately 5.5. 
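
The screening and grouping rules in this section can be summarized in a brief sketch. It is offered only as an illustration: the function names `eligible` and `anxiety_group` are hypothetical, the MMPI validity screen is omitted, and treating a missing AFQT score as non-disqualifying is an assumption (the text says only that AFQT scores were available for most Ss). The cutoffs themselves (lowest two TSI stanines, AFQT below 15, TAQ of 24 or higher, TAQ of 11 or less) come from the text.

```python
def eligible(tsi_stanine, afqt_raw=None):
    """Apply the intelligence screen described in the text.

    A recruit is dropped if the TSI falls in the lowest two stanines
    or the AFQT raw score is below 15. A missing AFQT (None) is
    assumed here not to disqualify.
    """
    if tsi_stanine <= 2:
        return False
    if afqt_raw is not None and afqt_raw < 15:
        return False
    return True


def anxiety_group(taq_score):
    """Classify by TAQ score: HTA (24 or higher), LTA (11 or less),
    or None for the middle of the distribution, which was not studied."""
    if taq_score >= 24:
        return "HTA"
    if taq_score <= 11:
        return "LTA"
    return None


# Hypothetical recruits: (TSI stanine, AFQT raw score, TAQ score).
recruits = [(5, 40, 27), (2, 40, 30), (6, None, 9), (7, 10, 8), (5, 33, 18)]
groups = [anxiety_group(taq) if eligible(tsi, afqt) else "excluded"
          for tsi, afqt, taq in recruits]
print(groups)
```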


RESULTS 
Effectiveness of the Experimental Operations 


The distributions of responses of the 146
high stress and 148 low stress Ss to each of 
the questions of the TRQ were compared by 


for such changes in wording as were necessary to
adapt it for use in a different context.
6 Personal communication from Abraham Carp.
means of the Marshall test (Smith, 1953).
The high stress Ss differed significantly from 
the low stress Ss in being convinced that the 
testing would decide their Air Force assign- 
ments. The low stress Ss more strongly ac- 
cepted the idea that testing was for experi- 
mental purposes. They differed significantly 
(p < .05) on 11 of the 22 questions. High 
stress Ss felt more anxious, a greater pressure
to do well, and a greater degree of unpleas- 
antness in the testing situation. The two 
groups did not differ in their evaluation of 
how much their emotions and anxieties had 
interfered with performance. 

The differences in TRQ responses of Ss 
scoring above and below the median on the
test anxiety measure were tested separately 
for Ss who had performed under high stress 
and Ss who had performed under low stress. 
Under high stress, HTA and LTA Ss re- 
sponded differently to 10 of the 22 questions 
(Marshall test, p<.05). HTA Ss more 
strongly believed that their Air Force assignment
depended on their performance, felt
more anxious, thought their emotions influ- 
enced their performance adversely, and con- 
sidered the testing unpleasant. The groups 
did not differ with respect to test motivation. 

Under low stress, HTA and LTA Ss re- 
sponded differently to 4 out of 22 questions. 

There was a stronger tendency for HTA Ss
to believe that their Air Force assignment
might be at stake, and they responded with
more anxiety to various aspects of the situation.
There were no differences in motivation,


feelings about the pleasantness of the situa- 
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A 
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Mean a Mean ¢ 
e 2 5700 27 
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Substitution A 21.5 F680 
Naber Form Board FA 0 10.8 
Number Matching 1 14 S9 10.9 
Number Matching 2 10:2. Say 446 
Lumber Matching 3 I 303 ITO 
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igence (TSI) 


df = 40, 
" High St varial 
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* p 18h Stress Ss were more variable 
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F =2.53, p <.05)- 
TABLE 2


PERFORMANCE DIFFERENCES OF LOW TEST ANXIETY
SUBJECTS UNDER HIGH VS. LOW STRESS


                         High Stress
Test                        Mean

Letter Series 1               4.9
Letter Series 2               6.2
Letter Substitution A        98.0
Paper Form Board             39.6
Number Matching 1            55.6
Number Matching 2            55.9
Number Matching 3            55.9
Letter Substitution B       133.5
Intelligence (TSI)            5.2

adf = 65. 

b Low Stress e more variable (F =2.38, p <.02); £ and 
the value of ¢ significance we: i 
the method di wards (195 

© High Stress Ss were more variable ( 83, p <.05). 

d Low Stress Ss were more variable (F = 2.03, p < .05).

*p<.05. 


tion, or evaluation of the influence their emo- 
tions had on performance. 


Performance 


The 32 HTA Ss tested under high stress 
performed significantly better than the 33 
HTA Ss working under low stress on the three 
Number Matching tests. Critical ratios of the 
mean differences yielded p values of < .01, 
< .001, and < .05. These results, however, were
based on groups differing not only with re- 
spect to stress treatments, but also in test 
anxiety level. The high stress HTA Ss were 
significantly more anxious than the low stress 
HTA Ss (Marshall test, p = .05). To control
for this the HTA groups were matched for 
level of test anxiety by a random selection 
procedure (Sperber, 1956, p. 136), 21 HTA 
Ss remaining in each stress group. 

The results of the t tests of the differences
in mean performance of the matched HTA
groups working under high vs. low stress are
given in Table 1. The groups also were adequately
matched for intelligence. Stress clearly
influenced the performance of HTA Ss, the 
groups tested under high stress performing 
significantly better on two tests, Number 
Matching 1 and 2. 

The results of the t tests of the differences
in mean performance of the LTA Ss tested 
under high vs. low stress are presented in 
Table 2. The groups were adequately matched 
both for intelligence and level of test anxiety. 
The LTA Ss consistently had higher mean 
scores under low stress. For both reasoning 
tests the mean differences reached statistical
significance. 

The t tests of differences in mean performance
of HTA and LTA Ss, both tested under
high stress, are presented in Table 3. On six
of the eight measures the HTA Ss performed
better. The only statistically significant difference
was on Number Matching 2, where
the HTA group was superior in performance. 
The results on the other two Number Match- 
ing tests supported this finding, both the 
mean differences favoring the HTA Ss, and 
approaching significance. For Number Match- 
ing 1 the value of p was .08, and for Number 

Matching 3 it was .10. 
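
The comparison reported for Number Matching 2 can be approximately rechecked from the summary statistics in Table 3. The sketch below is an illustration under assumptions, not the author's computation: the report does not give the exact formula, so a pooled-variance Student's t is assumed, the function names are hypothetical, and the LTA standard deviation is read as 10.4. The group sizes (32 HTA and 29 LTA Ss under high stress) come from the Subjects section, and the resulting df of 59 agrees with the table note.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups from summary statistics,
    using the pooled variance estimate. Returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


def variance_ratio(sd_larger, sd_smaller):
    """F ratio of two sample variances, larger over smaller."""
    return sd_larger ** 2 / sd_smaller ** 2


# Number Matching 2, high stress: HTA (M = 62.1, SD = 10.2, n = 32)
# vs. LTA (M = 55.9, SD = 10.4, n = 29), as read from Table 3.
t_nm2, df_nm2 = pooled_t(62.1, 10.2, 32, 55.9, 10.4, 29)

# Paper Form Board standard deviations, HTA 12.6 vs. LTA 8.5.
f_pfb = variance_ratio(12.6, 8.5)

print(round(t_nm2, 2), df_nm2, round(f_pfb, 2))
```

Under these assumptions t comes out close to, though not identical with, the tabled value of 2.31, a gap that rounding of the printed means and standard deviations could plausibly explain. The variance ratio for the Paper Form Board falls within rounding of the F of 2.19 given in the table note.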

The critical ratios, testing the significance 
of the differences in mean level of perform- 
ance under low stress of HTA and LTA Ss, 
are reported in Table 4. On all eight tests the 
LTA Ss were superior in performance. Two of 
the mean differences were statistically signifi- 
cant. For three more tests, namely, Number 
Matching 1 and 2, and Letter Substitution B, 
the differences were nearly significant, the 


values of p reaching .06, .09, and .07, respectively.


DISCUSSION 


Our findings do not appear to be compat- 
ible either with the Iowa or with the Sarason 
theory. With respect to the Iowa position, we 
find that under high stress, HTA Ss perform 
better on Number Matching than do LTA Ss. 
Here higher drive seems to facilitate performance,
the correct responses for this task
presumably being uppermost in the average 


TABLE 3


PERFORMANCE DIFFERENCES OF HIGH TEST ANXIETY
AND LOW TEST ANXIETY SUBJECTS UNDER HIGH STRESS


                              HTA             LTA
Test                      Mean     σ      Mean     σ       t a

Letter Series 1             4.6    2.0      4.9    2.2     AS
Letter Series 2             7.2    2.4      6.2    2.8     1.11
Letter Substitution A     106.0   22.8     98.0   19.0     1.46
Paper Form Board           37.5   12.6     39.6    8.5      .72 b
Number Matching 1          61.0   11.0     55.6   11.9     1.81
Number Matching 2          62.1   10.2     55.9   10.4     2.31*
Number Matching 3          61.3   11.7     55.9   13.0     1.69
Letter Substitution B     134.8   28.9    133.5   23.3     AS
Intelligence (TSI)          5.4    1.5      5.2    5       154

a df = 59.

b HTA Ss were more variable (F = 2.19, p < .05).

* p < .05.
TABLE 4


CRITICAL RATIO OF PERFORMANCE DIFFERENCES OF
HIGH TEST ANXIETY AND LOW TEST ANXIETY
SUBJECTS UNDER LOW STRESS


                              HTA             LTA
Test                      Mean     σ      Mean     σ       CR

Letter Series 1             5.8    2.8      6.3
Letter Series 2             7.1    Ai       7.9
Letter Substitution A      96.6   17.7    108.9
Paper Form Board           39.2    7.3     39.8
Number Matching 1          52.9   10.9     58.4
Number Matching 2          51.9   12.1     57.0
Number Matching 3          54.1   14.3     62.0
Letter Substitution B     125.4   18.3    137.6
Intelligence (TSI)          Sa     EUS      5.7


5s were more variable (F =2.73, p <.01); CR nec 
ficance at 


03 (Edwal 


ry 
with the observed heterogeneous 
s, 1950, p. 167-169). 

=3.28, p <.001). 
S’s response hierarchy. On the same tests
under low stress, however, LTA Ss perform 
better than HTA Ss, higher drive level ap- 
parently interfering with performance. Since 
the same test is involved in both cases, we can 
not readily attribute the reversal of the effects 
of a heightened anxiety drive to the different 
types of response hierarchies elicited by 
simple as contrasted with complex tasks. 
Taylor (1956) has commented on the many 
characteristics other than drive level in which 
anxious and nonanxious Ss may differ, and 
which may influence performance (p. 303). 
The interest of the Iowa group in their study 
of anxiety has been restricted to the effects of 
drive, although the influence of anxiety drive 
per se on performance has been acknowledged 
to be small (Spence & Taylor, 1953; Taylor, 
1956). Our own results confirm Taylor’s sug- 
gestion that to understand the performance of 
anxious and nonanxious Ss, a broader ap- 
proach to their characteristics is necessary. 
Such an approach will be indicated below. 
The effects of anxiety on the performance 
of our Air Force recruits were consistently 
opposite to the results reported by Sarason. 
For the Yale Ss, strong anxiety—as mani- 
fested by Ss who were both high in test 
anxiety and tested under high stress—inter- 
fered with performance, but for the recruits 
studied in the present research strong anxiety 
improved performance. For the Yale students, 
moderate anxiety—as manifested by HTA Ss 
tested under low stress, or LTA Ss studied 
under high stress—facilitated performance, 
but for our recruits it was associated with 
poorer performance. For the Yale students, a 
near lack of anxiety—such as would be the 
case for LTA Ss studied under low stress— 
was paralleled by a lowered level of perform- 
ance. In contrast, for our Air Force recruits 
this lack of anxiety was associated with an 
improvement in performance. 

A rationale for the performance of our re- 
cruits, and the difference between their per- 
formance pattern and that of Yale students 
can be offered in terms of the S’s assessment 
of the personal importance of the testing sit- 
uation, and the tendency to use avoidance or 
vigilance as a defense against anxiety, depending on 
the importance of a situation. 

Douvan (1956) has demonstrated that an 
S’s evaluation of the importance of a testing 
situation will depend on experiences associated 
with his social status. She showed that 
working class adolescents become involved 
and strive to do well in a situation where good 
performance can earn them an immediate 
reward, but are not motivated to do well when 
no direct reward is at stake. In contrast, the 
middle class adolescent appears to be strongly 
motivated for successful performance in both 
situations. Our Ss were predominantly from 
the working class (Sperber, 1959). [...] 
(1953) has shown that a low evaluation of 
formal education is characteristic of those 
from a working class background. Although 
the recruits were about the same age as the Yale 
students, a majority of them never completed 
high school. Their fewer years of education is 
assumed not to be simply a function of their 
intelligence level, but also to reflect a skepticism 
about the value of an academic education. 
This attitude was apparently shared by 
their parents since many of them were young 
enough to require parental consent to 
enlist. The Yale Ss were predominantly from 
the middle class (Sarason & Mandler, 1952) 
where characteristically there are pressures 
for academic achievement (McNeil, 
1953). 

The operation of vigilance and avoidance 
as defenses against anxiety is conceived in 
terms akin to Bruner and Postman’s (1947) 
description of perceptual vigilance and defense: 

In any given situation the organism singles out 
what it considers to be the environment’s most rele- 
vant aspects—relevant to adaptation in the situation. 
So long as the situation is not too threatening or too 
exacting, avoidance of meaning may be emotionally 
the most economical response. But in situations which 
are highly threatening and highly exacting, the most 
adaptive perceptual response is frequently the one 
which takes most vigilant account of “reality” (p. 
76). 


With reference to the present findings, we 
view the high stress situation, testing which 
the S thought would determine his future, as 
one in which the S would be motivated to 
participate to the best of his ability. Partici- 
pative rather than avoidance behavior seemed 
most in the S’s self-interest, and we assume 
both HTA and LTA Ss were involved in the 
performance situation. We attribute the supe- 
rior performance of the HTA Ss to the func- 
tion of anxiety as an internal stimulus which 
could constantly remind them of both the 
importance and danger of the situation, and 
reinforce their motivation to be alert and perform 
well. The performance of the LTA 
group did not benefit from this increase in 
vigilance. This rationale seems especially 
plausible when we consider the nature of the 
Number Matching tests, on which the signifi- 
cant performance differences occurred. The 
greater anxiety of the HTA Ss would main- 
tain their motivation to perform well on a 
repetitive, boring task, given at the end of a 
long session. An increment in perceptual vigilance 
would improve their discrimination of 
small differences. 

Under low stress the LTA group performed 
better. The TRQ data allow us to assume that 
for the HTA Ss the situation was sufficiently 
like one involving real tests, to elicit feelings 
of anxiety. Since the instructions used assured 
the Ss that it was safe to be unconcerned 
about one’s own performance, i.e., no imme- 
diate gain for them was possible, they would 
tend to defend themselves by withdrawal 
when the task became noxious. This need to 
avoid became strong enough to affect perform- 
ance only on the more repetitious tasks used. 

Our analysis of the results of the present 
study has been in terms which are similar in 
many respects to the theoretical formulations 
advanced by Lazarus and his colleagues (e.g., 
Vogel, Baker, & Lazarus, 1958) and by Ruebush 
(1960). The present study and other 
recent researches (Ruebush, 1960; Vogel et 
al., 1958; Wiener, 1959) have all presented 
results which point up the complex interac- 
tion of anxiety, motive, defense, and task 
variables. A prediction of whether anxiety 
leads to an improvement or decrement in the 
performance of a given S appears to depend 
on the answers to four related questions: (a) 
Does the S accept the performance situation 
as something so important that he must par- 
ticipate in it, or as something that can appro- 
priately be avoided? (b) Does the anxiety 
aroused in the S by the situation lead to motivations 
which are more important as determinants 
of his behavior than the actual 
amount of anxiety elicited? (c) With respect 
to the direct effect of anxiety on performance, 
what kind(s) of defensive behavior is (are) 
engendered in the S by varying amounts of 
anxiety? (d) What relationship obtains be- 
tween the nature of the S’s defenses and the 
structure and demands of the task? 


SUMMARY 


Zanwil Sperber 

WECHSLER’S DETERIORATION RATIO IN CLINICAL 
PRACTICE 


T. G. CROOKES 
St. John’s Hospital, Aylesbury, England 


Studies on Wechsler’s Deterioration Ratio 
have mostly suggested that it is of very lim- 
ited value in diagnosing brain damage. None 
of those, for instance, quoted by Yates (1954) 
showed any very convincing differentiation of 
brain damaged from other groups. Even where 
it distinguished brain damaged from “nor- 
mals,” it did not show any appreciable differ- 
ence between brain damaged and other psy- 
chiatric groups (Hall, 1952; Rogers, 1950), 
and it is this latter distinction which it is of 
practical value to make. The usual method is 
to take groups of known diagnoses, apply the 
tests to them, and compare the results. For 
reasons given below, this was not thought to 
be the most satisfactory way in this case, and 
it seemed to be of interest to examine the 
ratios in a group referred for routine purposes 
over a period of time, to look into the distri- 
bution of the values and compare them with 
the final diagnoses. 

The actual ratio used is not exactly the 
same as Wechsler’s. The same four “don’t 
hold” tests are used, but only two “hold” 
tests: Vocabulary and Picture Completion. 
The score on these is doubled, and the per- 
centage loss is calculated in the usual way 
with Wechsler’s age allowances. Object As- 
sembly was omitted because it does not seem 
to be a good hold test even on Wechsler’s data 
(1944, p. 150), and this is a rather embar- 
rassing test for adults, except the duller ones. 
Information was omitted to keep the balance 
of verbal and performance tests. The ratio 
will be referred to as DR. 


PROCEDURE 


Wechsler’s test is applied routinely to almost all 
patients referred for psychological examination at 
the above hospital. It is a regional National Health 
Service hospital, catering for all kinds of mental 
illness. 


The DR was calculated for all male inpatients re- 
ferred over a period of 6 years, whatever the reason 
for referral, providing they had done all the tests 
necessary for the ratio. In all cases it was the 
Wechsler-Bellevue Scale, Form I, and where the test 
had been repeated, the figures of the first testing were 
used. The diagnoses were the final diagnoses made on 
discharge or death, and, where the patient is still 
here, the latest diagnosis at the time of writing sup- 
plied by the psychiatrist in charge of the case. 

There are altogether 261 men, of whom 171 were 
less than 50 years of age at the time of testing, and 
90 were 50 or over. The main comparison is between 
those in whom the diagnosis implies physical inter- 
ference with the brain (the “organic” group) and the 
rest. The under-50 organic group consists of 23 patients: 
12 epileptics, 3 head injuries, 3 toxic conditions 
(2 of them alcoholic), 2 cerebral syphilis, 1 
confusional state in disseminated sclerosis, and 2 
cases of unspecified brain disease. The over-50 organic 
group contains 22 patients: 12 with senile or 
presenile dementia, 1 alcoholic dementia, 5 toxic conditions, 
2 cerebral syphilis, 1 head injury, and 1 
cerebrovascular disease. 

The remainder were divided into five broad diagnostic 
categories, to see if any type of score was 
peculiar to any group, and to see if the results agreed 
with the finding of Rogers (1950) that all maladjusted 
groups showed higher DRs than normals, but 
did not differ among themselves. The under-50 “nonorganic” 
group consists of 30 depressives (including 
manics), 45 schizophrenics, 29 neurotics, 33 psychopaths, 
and 11 paranoid states (including paraphrenics). 
The over-50 nonorganic group contains 40 
depressives, 1 schizophrenic, 10 neurotics, 9 psychopaths, 
and 8 paranoid states. 


RESULTS 


Table 1 shows the means and ranges of 
DRs for the organic and nonorganic groups. 

In both age groups the organics have considerably 
higher values than the others. 
(Comparing organic and nonorganic means, 
for the under-50 groups, t = 4.866, p < .001; 
for the over-50, t = 4.596, p < .001.) But for 
practical purposes, one needs to know to what 
extent they can be differentiated by taking 

TABLE 1 
MEAN DETERIORATION RATIO, ORGANIC AND NONORGANIC 

                      Under-50                      Over-50 
Group            N      M       Range          N      M       Range 
Organic          23   +28.9   +59 to -22       22   +23.0   +57 to -18 
Nonorganic      148    +8.8   +57 to -42       68    -0.4   +42 to -52 


some critical value or “cutting score.” Table 
2 gives the number falling above +20 
(Wechsler’s suggested critical score) and 
above +30, by age group, and whether organic 
or not. 

If the two age groups are combined, we 
find using a score of 20 that 76.6% are correctly 
classified (organics 64.4% and nonorganics 
79.2%). Using the score of 30, 85.1% 
are correctly classified (organics 53.3% and 
nonorganics 91.7%). The increase in total 
percentage correct obtained by raising the 
critical score from 20 to 30 is mainly due to 
the fact that the nonorganic group is much 
larger than the organic; by raising the score 
the discrimination of the nonorganic group is 
improved, that of the organic group is made 
worse, and as the former is so much bigger 
they contribute more to the total percentage. 
However, it will be noted that with the score 
of 30, the mean of the organic and nonorganic 
percentages (72.5) is slightly greater than 
their mean with the score of 20 (71.8). This 
mean can be considered as the percentage 
correct if the two groups are made equal in 
number; so the cutting score of 30 appears 
slightly better. In any case, it is probably of 
more value in this kind of problem to have a 
more extreme score, which nearly eliminates 


TABLE 2 

PROPORTIONS DISTINGUISHED BY CUTTING 
SCORES OF +20 AND +30 

                    Under-50              Over-50 
Group            DR<21   DR>20        DR<21   DR>20 
Organic          [...]   [...]        [...]     13 
Nonorganic       [...]   [...]        [...]   [...] 
                 (75% correct)        (80% correct) 

                 DR<31   DR>30        DR<31   DR>30 
Organic            10      13           11      11 
Nonorganic        134      14           64       4 
                 (86% correct)        (83% correct) 


one group, providing it leaves a substantial 
number of the other. For instance, a score 
which gave 100% nonorganics correct and 
50% organics, would be of more practical 
value than one which gave 75% of each, al- 
though the total percentage correct is the 
same. In the first case you would be able to 
feel virtual certainty when some scores came 
up, but never in the second case. The whole 
general question of how proportions obtained 
in this way can be used in a practical situa- 
tion, where the size of the populations from 
which you are drawing your samples is un- 
known, is considered in the discussion. 
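
The percentages quoted for the two cutting scores can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, the counts above +30 (24 of 45 organics, 18 of 216 nonorganics) are those given in the discussion, and the counts above +20 (29 organics, 45 nonorganics) are assumptions back-computed from the stated 64.4% and 79.2%.

```python
def percent_correct(n_org, n_nonorg, org_above, nonorg_above):
    """Return (organic %, nonorganic %, total %, mean of the two group %).

    An organic case counts as correct when its DR is above the cutting
    score; a nonorganic case counts as correct when it is not.
    """
    org_correct = org_above
    nonorg_correct = n_nonorg - nonorg_above
    org_pct = 100 * org_correct / n_org
    nonorg_pct = 100 * nonorg_correct / n_nonorg
    total_pct = 100 * (org_correct + nonorg_correct) / (n_org + n_nonorg)
    balanced_pct = (org_pct + nonorg_pct) / 2  # groups treated as equal in size
    return org_pct, nonorg_pct, total_pct, balanced_pct


# Cutting score +20: counts are assumptions derived from the stated percentages.
cut20 = percent_correct(45, 216, org_above=29, nonorg_above=45)
# Cutting score +30: counts as stated in the discussion.
cut30 = percent_correct(45, 216, org_above=24, nonorg_above=18)

for label, result in (("+20", cut20), ("+30", cut30)):
    print(label, [round(value, 1) for value in result])
```

The last value in each list is the mean of the two group percentages, which is the figure the text uses to compare the cutting scores independently of the unequal group sizes.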

Table 3 shows the mean values of the DRs 
for the nonorganic patients divided into five 
diagnostic categories, again separated into the 
two age groups. 

The most striking thing about Table 3 is 
that the younger patients consistently give 
higher values than the older ones, while 
within each group, there is little difference 
between diagnoses. This is fairly clear by 
inspection, but a simple analysis of variance 
was carried out for each age group (leaving 
out the schizophrenic in the over-50). In each 
case, the between diagnoses mean square was 
smaller than that within diagnoses. The schiz- 
ophrenics and paranoid states, under-50, were 
combined (one can find reasons why they 
might be combined) and compared with the 
rest of their age group. This gives t = 1.31, 
p > .1. In the older age group, the neurotics 
were compared with the rest; t = 1.63, p 
> .1. It will be seen also that the high values 
of the DR are fairly evenly distributed among 
the various groups. 

From the way in which the DR is calcu- 
lated, the expected mean value for any age 
group is nought, if the group is comparable 
with Wechsler’s standardization group. For 
the over-50 group, the mean is very close to 

TABLE 3 
DETERIORATION RATIO OF NONORGANIC PATIENTS BY DIAGNOSIS 

                          Under-50                          Over-50 
Group              N     M       Range      n>30     N     M       Range      n>30 
Depressives        30   +6.8   +47 to -42     3      40   +0.8   +42 to -52     3 
Schizophrenics     45  +10.4   +57 to -34     5       1  +14                    0 
Paranoid states    11  +14.1   +31 to -14     1       8   +0.1   +26 to -36     0 
Neurotics          29   +8.1   +46 to -26     4      10   -9.6   +32 to -29     1 
Psychopaths        33   [...]  +49 to -29     1       9   +2.6   +28 to -16     0 
Total             148   +8.8   +57 to -42    14      68   -0.4   +42 to -52     4 

nought, but not for the under-50 group; comparing 
their mean (+8.81) with its standard 
error (1.50) gives t = 5.87, p < .001. 


DISCUSSION 


In differentiating organics from nonorgan- 
ics, the results here seem rather better than 
those obtained in most studies. One reason 
for this is probably that these are not selected 
groups of patients, but patients referred for 
information in the routine way. This means 
that in most cases the diagnosis was in doubt 
at the time of referral, and the deterioration 
was not marked. Gutman (1950), in summing 
up her study, says that hers were definite 
cases and that in less advanced cases even 
poorer differentiation would be expected. In 
general, the principle of taking extreme cases 
is excellent, but in this case it is not appro- 
priate. The theory behind the ratio is that 
certain types of activity deteriorate more 
rapidly than others with aging and other 
forms of damage to the brain. It is not sug- 
gested that the other activities do not dete- 
riorate at all. Clearly with gross dementia all 
abilities, including Vocabulary, disappear, cf. 
Yates (1956). It might be expected that the 
discrepancy would be most clear in the early 
stages, while later on all abilities declined to 
a common level. 

Hall’s (1952) organic cases are described 
as mostly moderately deteriorated, and he 
found only 5 out of 24 with ratios over +29 
and only 7 over +19. His inclusion of Object 
Assembly may partly account for this. The 
group described by McFie and Piercy (1952), 
consisting of patients with localized cerebral 


lesions, gave results fairly close to those of 
the present study. Of 56 DRs calculated, they 
had 24 (43%) over +29 and 37 (66%) over 
+19. For my whole organic group, the equiv- 
alent percentages are 53 and 67. 

An interesting problem is the question of 
what weight can be given to an actual DR in 
a practical situation. If one has obtained a 
ratio greater than 30, what can be said about 
it? In this group, 18 out of 216 nonorganics 
obtained such a score, or 1 in 12; 24 out of 45 
organics did so, rather more than half. So, in 
a sense, it can be said that such a score has a 
good probability of being organic; an organic 
case is much more likely to produce one than 
a nonorganic. However, if one considers the 
distribution of the 42 such scores obtained, 
only 24 are organic, and the probability of a 
given score of over 30 being organic is only 
4/7, not much more than a half. In general, 
this is because the probability of an event’s 
belonging to one of two categories depends 
not only on the relative frequency of the 
event within each category, but also on the 
relative frequency of the two categories in the 
population from which the event is drawn. 
This seems to be a general difficulty in all 
tests of this kind except where a certain score 
is found in virtually 100% of one category. 
The proportions found in one situation cannot 
be used for giving a probability in another 
situation where the constitution of the population 
is not known, as in most clinical situations. 
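
The dependence on the composition of the population can be made explicit with Bayes' rule. The following is a minimal sketch, not part of the original analysis: the function name is hypothetical, the two rates are the proportions reported above for a DR over +30 (24 of 45 organics, 18 of 216 nonorganics), and any base rate other than that of the present series (45 of 261) is an illustrative assumption.

```python
def prob_organic_given_high_dr(rate_in_organics, rate_in_nonorganics, base_rate):
    """Probability that a case with a high DR is organic (Bayes' rule).

    base_rate is the proportion of organic cases in the population from
    which the patient is drawn.
    """
    organic_and_high = rate_in_organics * base_rate
    nonorganic_and_high = rate_in_nonorganics * (1 - base_rate)
    return organic_and_high / (organic_and_high + nonorganic_and_high)


RATE_ORGANIC = 24 / 45       # organics with DR over +30 in the present series
RATE_NONORGANIC = 18 / 216   # nonorganics with DR over +30 in the present series

# With the base rate of the present series the result is 24/42, i.e. 4/7.
in_this_series = prob_organic_given_high_dr(RATE_ORGANIC, RATE_NONORGANIC, 45 / 261)
print(round(in_this_series, 3))

# Hypothetical base rates, to show how strongly the answer depends on them.
for assumed_base_rate in (0.05, 0.50):
    p = prob_organic_given_high_dr(RATE_ORGANIC, RATE_NONORGANIC, assumed_base_rate)
    print(assumed_base_rate, round(p, 3))
```

The same two within-group proportions give quite different probabilities as the assumed proportion of organic cases changes, which is the difficulty described in the paragraph above.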

The information has to be combined with 
data from other tests, and with qualitative 
indications on the Wechsler itself, for example, 
differences on the Similarities (Hall, 
1952) and “rotation” on the Block Designs 
(Shapiro, 1951), which are known to be re- 
lated to the distinction being made. Another 
approach might be to attempt to eliminate 
individuals who come at the extreme of the 
distribution of DRs without having deterio- 
rated. According to Wechsler (1944, p. 66), a 
DR of +20 is 2 Probable Errors from the 
mean, so that in a normal group one would 
expect to find about 9% above +20, and 
about 2% above +30. Subjects with the 
classical type of congenital reading disability, 
in which ordering and spatial organization are 
involved, might be expected to make poor 
scores on don’t hold tests, especially Digit 
Span, Arithmetic, and Block Designs. There 
are three subjects in the present series with 
reading difficulties marked enough for it to be 
mentioned as one of their problems. They are 
all in the under-50 nonorganic group, and all 
have DRs greater than +29. It may be noted 
that the present group agrees with Wechsler’s 
in the distribution at the other extreme of the 
scale. There are 8 minus ratios greater than 
30 (3%) and 21 greater than 20 (8%). 
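
The expected proportions quoted from Wechsler follow from the normal curve once the Probable Error is converted to standard deviation units. The sketch below assumes a normal distribution of DRs with mean nought and takes +20 as 2 Probable Errors (so +30 is 3); the constant 0.6745 is the usual ratio of the Probable Error to the standard deviation, and the function name is hypothetical.

```python
import math

PE_IN_SD_UNITS = 0.6745  # one Probable Error expressed in standard deviations


def proportion_above(probable_errors):
    """Upper-tail area of the normal curve beyond the given number of PEs."""
    z = probable_errors * PE_IN_SD_UNITS
    return 0.5 * math.erfc(z / math.sqrt(2))


above_20 = proportion_above(2)  # DR of +20, assumed to be 2 PE from the mean
above_30 = proportion_above(3)  # DR of +30, assumed to be 3 PE from the mean
print(f"expected above +20: {100 * above_20:.1f}%")
print(f"expected above +30: {100 * above_30:.1f}%")

# Observed minus ratios in the present series, out of 261 patients.
print(f"observed below -20: {100 * 21 / 261:.1f}%")
print(f"observed below -30: {100 * 8 / 261:.1f}%")
```

By symmetry the same expected proportions apply to the negative end of the scale, which is where the present series is compared with Wechsler's distribution.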

Another thing associated with high DRs is 
low scoring on the test. Hold scores of 4 
and don’t hold scores of 2 do not suggest 
much absolute difference in the abilities concerned, 
but they would give a DR of +50. In 
the nonorganic under-50 group, there are 19 
patients with IQ less than 80; 6 of these have 
DRs over +30, as opposed to 8 out of the 
other 129 with IQ over 79. This is not of 
much practical help, because low scores also 
occur in organic cases. In the organic under-50 
group, there are 5 out of 23 with IQ below 
80, and all 5 have DRs over +30. However, 
it does increase one’s confidence in the significance 
of a high DR in a person of higher IQ. 
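
The effect of low scores on the ratio can be seen by working the example through. The sketch below is one reading of the modified ratio described in the introduction (the two hold tests doubled, set against the four don't hold tests); the function and parameter names are hypothetical, and the age allowance is left as a parameter defaulting to zero because Wechsler's table of allowances is not reproduced here.

```python
def deterioration_ratio(vocabulary, picture_completion, dont_hold_scores,
                        age_allowance=0.0):
    """Percentage loss of don't hold relative to hold scores, less an age allowance.

    The hold total is twice the sum of the two hold tests, so that it is
    comparable with the sum of the four don't hold tests.
    """
    if len(dont_hold_scores) != 4:
        raise ValueError("expected scores on the four don't hold tests")
    hold_total = 2 * (vocabulary + picture_completion)
    dont_hold_total = sum(dont_hold_scores)
    percent_loss = 100 * (hold_total - dont_hold_total) / hold_total
    return percent_loss - age_allowance


# Hold scores of 4 and don't hold scores of 2: a 2-point gap gives +50.
low_scorer = deterioration_ratio(4, 4, [2, 2, 2, 2])
# The same 2-point gap at a higher level of scoring gives a much smaller ratio.
higher_scorer = deterioration_ratio(12, 12, [10, 10, 10, 10])
print(low_scorer, round(higher_scorer, 1))
```

Because the ratio is a percentage of the hold total, the same absolute difference weighs far more heavily when all scores are low, which is why a high DR is more convincing in a person of higher IQ.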

As far as the under-50 nonorganic group is 
concerned, the study confirms the finding of 
Rogers (1950) that psychiatric groups show 
a higher DR than normals but do not differ 
among themselves. The over-50 nonorganic 
group, however, does not show more deterioration 
than the normal expectation. This curious 
age difference was also found, in psychiatric 
patients, by Garfield and Fey (1948). 
A similar age difference, in the same direction, 
appears in the organic groups. One explanation 
could be that the don’t hold tests are 
affected by either mental illness or aging, but 
the two factors do not have an additive effect 
when they occur together, so that the older 
patient is not affected more than is allowed 
for in the age corrections; alternatively, the 
illness, when combined with greater age, 
might affect the hold as well as don’t hold 
tests. If patients were retested after recovery 
with the equivalent form of the test, on each 
hypothesis the younger ones would be the 
same on the hold tests, and better on the 
don’t hold; on the first hypothesis the older 
patients would be the same on both types of 
test, on the second, they would be better 
on both. 

The fact that mental illness has this tend- 
ency to raise the DR does not destroy the 
utility of the index in the diagnosis of brain 
damage, since the latter raises it significantly 
more. An index of over 30 gives a strong 
initial suggestion of brain damage, especially 
when the IQ is not low. Whatever method of 
assessing deterioration is used, an estimate of 
intelligence is needed in evaluating the re- 
sults, so that Wechsler’s Scale can be used to 


fulfil a double purpose. 


SUMMARY 


A modified form of Wechsler’s Deteriora- 
tion Ratio was calculated retrospectively for 
261 male mental hospital inpatients who had 
been tested routinely over a period of 6 years. 
The scores were compared with the eventual 
diagnoses, and it was found that the ratio 
distinguished organic and nonorganic cases 
fairly well. In the nonorganic group, there 
was no variation corresponding to diagnosis, 
but there was confirmation of previous find- 
ings that psychiatric patients give larger 
ratios than normals. This, however, applied 
only to the younger group (under-50). 

VOCABULARY DEFICIT IN BRAIN OPERATED 
SCHIZOPHRENICS 

ROY M. HAMLIN 2 
University of Pittsburgh School of Medicine 

AND ELAINE F. KINDER 
Rockland State Hospital, Orangeburg, New York 

Traditionally, vocabulary has been regarded 
as one of the functions affected least 
and latest by impairment due either to 
schizophrenia or to brain damage. In a recent 
study, Smith and Kinder (1959; Smith, 
1958) report no significant loss in vocabulary 
for topectomized schizophrenics as compared 
with controls, 8 years after surgery. On a 
variety of other tests the brain operated 
subjects did show significant losses. Smith 
has suggested that significant losses tend to 
appear more frequently on measures involving 
sustained attention and perceptual organization 
than on measures involving vocabulary 
and verbal skills. 

Williams, Lubin, and Gieseking (1959) 
report seemingly opposite results for 
nonschizophrenic subjects with a variety of brain 
conditions: vocabulary and verbal skills not 
only show clearly significant losses, but 
furthermore the amount of loss is as much, 
or even more, than on spatial tests. Subject 
samples, site and nature of tissue damage, 
lapse of time following the lesions, and the 
type of vocabulary test used raise many questions. 
The present study focuses on the last 
of these three factors: what is the effect of 
an oral test format, as compared with a 
multiple-choice format, on vocabulary scores 
following brain surgery? 

This factor of test format, and of the 
testing procedure, has received little attention 
in studies of deficit following brain damage. 
The Williams et al. study does comment on 
the test used. These authors base their conclusions 
on the Reading and Vocabulary 
(RV) Test from the Army Classification 
Battery (ACB). They offer evidence for the 
essential equivalence of the RV test and the 
Wechsler oral Vocabulary test, citing correlations 
of .76 between these two tests for Army 
recruits, for enlisted men, and for brain 
injured patients. They also note that the 
“majority of the items in RV require the S 
to read [...]. Only 
a minority are standard vocabulary items.” 

2 Now at the Veterans Administration Hospital, 
Danville, Illinois. The senior author’s participation in 
[...] Follow-Up Project with a research grant (M-1191) 
from the National Institute of Mental Health and 
funds and assistance from the Research Foundation 
for Mental Health, Inc. of the New York State 
Department of Mental Hygiene. 

Although the present study limits 
consideration to test procedure, some comment 
on the general and often contradictory evi- 
dence for vocabulary deficit following brain 
damage should be offered. Two of the most 
relevant and adequate studies are those of 
Yates (1956) and of Weinstein and Teuber 
(1957). Yates, on the basis of an extensive 
review, concludes that “vocabulary does 
decline in patients suffering from brain- 
damage.” This general conclusion may be 
supplemented by Weinstein and Teuber’s 
careful consideration of differential effects 
when lesions involve specific brain areas. 
With pretest and posttest scores for both 
injured and control subjects, these latter 
authors report obvious loss in verbal and 
vocabulary scores some 10 years after focal 
lesions involving the left parietal-temporal 
areas, and/or with aphasia as a symptom. 
For other focal lesions, including frontal lobe 
lesions, they report little or no loss in vocab- 
ulary and similar tests. 

In regard to the more immediate question 
of test format and test procedure, vocabulary 
deficit may well be in part a function of the 
vocabulary test used. In the previously cited 
study, Weinstein and Teuber (1957) found 
little verbal loss in certain brain injured 
groups; but these same subjects showed 
obvious loss on a hidden picture test. Is the 
multiple-choice test something like a hidden 
picture test, with the correct answer word 
“perceptually hidden” to some degree among 
three or four other words? Again, in the 
Smith and Kinder (1959) study, the Capps 
Homograph Test showed significant loss al- 
though oral vocabulary did not. The Homo- 
graph Test is a vocabulary test which intro- 
duces an additional element, requiring the 
subject to make a conceptual shift from one 
definition to another for the same word. 

The present study used three test formats 
with subjects from the same sample studied 
by Smith and Kinder (1959): one oral, and 
two multiple-choice, tests of vocabulary. The 
first multiple-choice procedure maximized the 
factor of sustained attention. The second 
multiple-choice procedure minimized the fac- 
tor of attention. The element of perceptual 
organization was assumed to be inherent in 
both multiple-choice procedures. 

Since the subjects had taken Wechsler oral 
Vocabulary tests before surgery and again 
8 years after surgery, the chief interest in the 
present study was in the new multiple-choice 
procedure which was designed to require both 
sustained attention and perceptual organization. 
The same words used in this multiple- 
choice procedure were then given orally and 
again under modified multiple-choice condi- 
tions to determine the comparability of these 
words to the Wechsler words, and to explore 
the extent to which each subject could achieve 
an approximate definition under any condi- 
tions. Three hypotheses were considered: 


1. Topectomized schizophrenics will show 
significant loss on a multiple-choice test of 
vocabulary which maximizes the importance 
of sustained attention. 

2. Topectomized schizophrenics will show 
no significant loss when the same words are 
then given as an oral vocabulary test. Smith 
and Kinder (1959) had already reported no 
loss in oral vocabulary for these subjects 8 
years after surgery. The present oral test was 
designed to insure that the specific word list 
used did not account for a difference in 
results. 

3. Topectomized schizophrenics will show 
a tendency to loss when the same words are 
given a third time in a multiple-choice pro- 
cedure minimizing the importance of atten- 
tion. The number of correct definitions was 
expected to be greater than on the first 
multiple-choice procedure and less than on 
the oral procedure. 


METHOD 
Subjects. The subjects and surgical procedures 
have been described in detail elsewhere (Lewis, 


Landis, & King, 1956; Smith & Kinder, 1959). The 
40 subjects included in the present study were all 
patients at Rockland State Hospital, Orangeburg, 
New York. Of these 21 were operated subjects, and 
19 nonoperated controls. All were from the New 
York State Brain Research Project, had been diag- 
nosed as schizophrenic, and had been included in 
the study reported by Smith and Kinder. 

The operations had been performed 10 years prior 
to the present study. The subjects had been originally 
divided into a C group of older subjects (age 
range 47-58 as of 1949) and a D group of younger 
subjects (age range 21-38 in 1949). Surgery consisted 
of an orbital or a superior topectomy, both 
bilateral frontal lobe operations. 

Thirty-eight subjects were assigned to 19 pairs 
of operated and control patients, matched for 
preoperative Wechsler Vocabulary. In 15 pairs, it was 
possible to match the subjects by age 
group. In the other 4 pairs, they were matched for 
Vocabulary but not for age. 

Tests. In addition to the Wechsler Vocabulary 
before and 8 years after surgery, the 40 subjects 
were given three new vocabulary tests 10 years 
after surgery. The five tests, with the designation 
used in Table 1, were as follows: 


1. Pre: The preoperative vocabulary score is the 
average of two Wechsler oral Vocabulary raw scores 
obtained within a period of a few months before 
surgery in 1949. On occasion, only one Wechsler 
was given before surgery, rather than the usual two. 

2. Post-8: The average of two Wechsler oral 
Vocabulary tests given within a 30-day interval 
8 years after surgery. 

3. Oral: The 60 words in the first civilian edition 
of the Army General Classification Test (AGCT, 
1947) were given orally and scored as in the 
Wechsler. 

4. MC-1: The first multiple-choice procedure 
consisted of the same 60 items from the AGCT. 
The stem phrase was presented on a card for > 
seconds: for example, “BIG means the same as . . .” 
The stem card was then removed, and a 10-second 
delay followed. The four answer words from the 
AGCT were then presented, without the stem card, 
and the subject marked his response. The delay 
Procedure emphasized the necessity for sustained 
attention. 

5. MC-2: The second multiple-choice procedure 
employed the same materials. The four answer 
words from the AGCT were presented first, and the 
subject read them aloud. Mistakes in reading were 
corrected. The stem card was then presented and 
read aloud by the examiner, with the answer words 
still in view. The subject then gave his answer 
orally. At every step, the examiner encouraged 
maximum attention to each element in the task. 
This procedure minimized the effect of variable 
attention. 


In the Oral, MC-1, and MC-2 procedures, a 
“base” and “ceiling” were used where obviously 
indicated; subjects responded to an average of some 
45 of the 60 items. A correction for chance successes 
was used. The order of the tests was: MC-1, 
Oral, MC-2, with MC-2 given in a second session. 
The advantages and disadvantages of a rotated order 
for the tests were considered. With the number of 
subjects available and the anticipated variability of 
psychotic behaviors, it was felt that exactly the same 
procedure should be used with all subjects, in order 
not to lose the precision in analysis offered by pair 
matching. As previously indicated, the oral and 
second multiple-choice procedures were designed as 
control observations. The unexpected results with 
the second multiple-choice procedure will be discussed 
later. 

Analysis. Results were analyzed by the sign test,
and by t tests for differences between correlated
means; p values refer to a one-tailed test.2

2 Ardie Lubin and Elizabeth Engle of the Walter
Reed Army Institute of Research have provided the
authors with detailed statisti . . .


RESULTS


Table 1 presents the results for the five
vocabulary tests in terms of the number of
controls surpassing topectomized patients, for
19 pairs of subjects matched on preoperative
Wechsler Vocabulary. Significant values for
the sign test are indicated. The results will
be presented under the following headings:
MC-1, Oral, MC-2, Wechsler, and the three
hypotheses.


TABLE 1

NUMBER OF CONTROLS SURPASSING TOPECTOMIZED
PATIENTS ON FIVE VOCABULARY SCORES, FOR NINE-
TEEN PAIRS OF SUBJECTS MATCHED ON PRE-
OPERATIVE WECHSLER VOCABULARY

                  Pre   Post-8   MC-1   Oral   MC-2
Control Higher     8     14*      12     13*    16**
Ties               1      0        2      1      0

Note. Pre = Preoperative Wechsler (1949).
Post-8 = Wechsler (19 ).
MC-1 = Multiple-choice, with delay (1959).
Oral = New oral vocabulary (1959).
MC-2 = Multiple-choice, no delay (1959).


MC-1. The first multiple-choice procedure
incorporated both attention and perceptual
organization, as well as word knowledge. It
was expected that this procedure might dif-
ferentiate the topectomized and control
groups. It failed to do so. Neither the sign
test (Table 1) nor the difference between
means (t = 1.41) is significant. As with every
vocabulary test given after surgery, the con-
sistent tendency for the controls to surpass
the operates is apparent. The mean superi-
ority of the controls is 4.3 words.

Oral. Since Smith and Kinder (1959) 
reported no significant difference in oral 
vocabulary for these subjects, it was expected 
that the new Oral vocabulary test would not 
differentiate the topectomized and control 
groups. This expectation was further encour- 
aged by the high correlation between the new 
Oral and Wechsler oral given 2 years earlier: 
for 41 subjects, this correlation was .89. For 
19 pairs matched on preoperative vocabulary, 
13 pairs show Oral scores higher for the 
control subject, 5 higher for the operate, and 
1 tie. The sign test is significant at the .05 
level, suggesting some loss in oral vocabulary 
due to brain surgery. The t for the difference
between means, however, is 1.29 and not 
significant. The average superiority of the 
control over the operate is 2.3 words. 

MC-2. The second multiple-choice pro- 
cedure minimized the importance of sustained 
attention but retained the factor of perceptual 
organization. This vocabulary test format 
yielded a clearly significant difference associ- 
ated with brain surgery. Of 19 controls, 16 
surpassed the topectomized subject with 
whom they were paired. The sign test is sig-
nificant at the .01 level. The t for the dif-
ference between means is 3.21, also significant
at the .01 level. The mean superiority of the
controls over the operates is 6.3 words,
nearly three times as much as for the same
words given orally.3
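
The sign-test results reported for the three new tests can be checked with a short calculation. The following is only a sketch: it assumes that tied pairs were dropped before testing and that the one-tailed p value is the upper binomial tail with chance probability .5, which is the conventional form of the sign test but is not spelled out in the text. The function name is illustrative.

```python
from math import comb


def sign_test_one_tailed(wins: int, losses: int) -> float:
    """Upper-tail probability P(X >= wins) for X ~ Binomial(wins + losses, 0.5).

    Assumption: tied pairs have already been excluded from the counts.
    """
    n = wins + losses
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n


# (controls higher, operates higher) for the 19 matched pairs in Table 1;
# operate counts are 19 minus the "Control Higher" and "Ties" entries.
table_1 = {
    "MC-1": (12, 5),
    "Oral": (13, 5),
    "MC-2": (16, 3),
}

for name, (wins, losses) in table_1.items():
    print(f"{name}: one-tailed p = {sign_test_one_tailed(wins, losses):.3f}")
```

Under these assumptions the MC-1 count falls short of the .05 level, the Oral count falls just under it, and the MC-2 count falls under the .01 level, which agrees with the significance levels stated in the text.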

The serial order in which the tests were 
given could be important: MC-1, Oral, MC-2. 
The control subjects showed a consistent in- 
crease in mean score on each later procedure: 
30.9, 35.6, and 38.1. The topectomized sub- 
jects obtained the following mean scores on 
the respective tests in order of presentation:
26.6, 33.3, and 31.8. 

Wechsler. Smith and Kinder, using analysis
of covariance, report no significant loss in
Wechsler oral Vocabulary 8 years after
surgery for these subjects. Pair matching,
however, indicates significant loss. For 19
pairs, the control surpasses the operate in
14 cases. The sign test is significant at the
.05 level. The t for the difference between
means is 2.46, also significant at the .05
level.


To confirm . . . for loss in . . . level. The sign test . . .
at the .01 level (p = . . .) . . . loss due to brain surgery is 2.6 words.

Hypotheses. . . . hypotheses follow:


1. The topectomized schizophrenics do not
show a significant degree of relative loss on
the multiple-choice vocabulary test requiring
sustained attention and perceptual
organization. As with all the vocabulary tests
given after surgery, the consistent tendency
for the controls to surpass the operates is
suggested.


3 The gain from Oral to MC-2, for the 19 pairs
of subjects, was greater for the control 15 times,
for the operate 3 times, with 1 tie. This manipula-
tion of scores has questionable justification, but does
suggest the effect of different test formats on the
same words.

2. The topectomized subjects do show
significant loss when the same words are then 
given orally. The sign test for the new Oral 
vocabulary indicates a difference in favor of 
the controls at the .05 level of confidence. 
The superiority of the controls need not be 
associated with the serial order in which the 
tests were given in the present study, since 
results for the Wechsler oral Vocabulary, 
given 2 years earlier, are comparable. Both
the sign test and difference between means are
significant at the .05 level.

3. The multiple-choice test which mini- 
mized the necessity for sustained attention 
clearly differentiates the operates from the 
controls. This test was given last in all cases. 
The topectomized schizophrenics show more 
loss, and more significant loss, in vocabulary 
on this test than on any of the others. The 
differential effect of “test procedure” is indi- 
cated, but both test format and order of 
presentation may contribute to the result. 


DISCUSSION


The present study considered specifically 
the effect of test format on vocabulary scores 
after brain surgery. The results do suggest 
that test procedure may be an important 
factor. However, another factor in test pro- 
cedure, the order of test presentation, may 
influence the results as much as the test 
format. This Discussion will consider briefly 
the general results of the study, and will then 
comment on the special question of test 
procedure. 

The evidence for some loss in vocabulary 
in topectomized patients, as compared with 
controls, 10 years after surgery is not sur- 
prising. The vocabulary losses reported here 
are small in terms of the instruments’ pre- 
cision: the relative loss in Wechsler Vocabu- 
lary is some 2.5 words. The number of 
studies, employing comparable controls, with 
which these results can be compared is ex- 
ceedingly small. Only a few reports present: 
control subjects acceptably matched with 
experimental subjects on the basis of tests 
before brain damage; comparable pretest and 
posttest scores, especially after the lapse of 
a number of years; and any reasonably ade- 
quate evidence for the location of the lesion. 
Perhaps the most nearly comparable study is 
that of Weinstein and Teuber (1957) previ-
ously cited. These authors reported marked
loss on verbal and vocabulary tests some 10 
years after certain focal brain injuries, but 
little or no loss on such tests when only the 
frontal lobes were involved. The table which 
they present does indicate some tendency to 
relative loss for every injury group con-
sidered, whatever area of the brain was in-
volved. That is, some injury groups gained
over their preinjury scores, but the gain was
never as great as the gain made by the con-
trol subjects. The present results are similar:
slight, but in this case significant, loss in
vocabulary 8 and 10 years after topectomies
involving frontal lobe areas. Since Weinstein
and Teuber (1957) report on a group test,
on nonpsychotic subjects, and on injuries
rather than bilateral operations, their study
is only roughly comparable to the present
one. The striking loss in verbal skills reported
by Williams et al. (1959) must be evaluated
in light of the heterogeneous brain conditions
involved and the relatively short length of
time following recovery. The evidence sug-
gests that vocabulary deficit some 10 years
after limited frontal lobe lesions is not great,
but may be statistically significant under
conditions such as those of the present study.

The effect of a specific test procedure em-
ployed in measuring vocabulary deficit has
been given only incidental attention in previ-
ous studies. In one sense, the present results
are clear. These topectomized subjects tend
to show some deficit in vocabulary on every
vocabulary test given 8 or 10 years after
surgery. The effects of surgery are most
clearly apparent, however, on the multiple-
choice procedure MC-2, which was given last
in order of testing. The effect of test proce-
dure is indicated, but either test format or
order of test presentation could account for
the results. . . . the effect of format
and of order may be closely . . . the present study . . .

. . . suggestion calls for a final word on the special
problem of demonstrating effects due to
surgery when both experimental and control
subjects are markedly psychotic. For such a
purpose, the most suitable test may be one
which minimizes the capricious and unpre-
dictable vagaries of schizophrenic behavior,
and yet retains some element related to brain
damage. The MC-2 procedure may have been
most effective for this reason. Each element
in the task was presented separately, and 
every effort was made to insure active at- 
tention. Reading errors were corrected, and 
the subject reviewed each correction. When 
the crucial answer was finally permitted, the 
schizophrenic patient’s sporadic or “func- 
tional” inattention to reality may have been 
largely overcome. The resulting optimal 
capacity to define words and deal with per- 
ceptual organization could then constitute a 
measure which would demonstrate the effects 
of surgery. The importance of simplifying 
elements, insuring attention, and using re- 
peated measures should be considered. 
Paradoxically, such precautions might result 
in a measure of some aspect of “attention” 
which would reflect the effects of brain 
surgery. 

The most discriminative test, MC-2, was 
given last to all subjects, after they had had
repeated exposure to the words involved. Of 
the two oral tests, the one with which the 
subjects had had previous experience dis- 
criminated best. Sheer’s (cf. Lewis, Landis, 
& King, 1956) astute observations on “prac- 
tice gains” are related to the importance of 
driving each element in the task home before 
expecting schizophrenic subjects to respond 
in a sufficiently comparable manner to per- 
mit measurement of differences. That is, 
familiarity with a test may serve to decrease 
unpredictable and irrelevant schizophrenic 
behaviors. Unless the idiosyncratic capri- 
ciousness of schizophrenic behavior is taken 
into account, relevant differences may be
readily obscured.


SUMMARY 


The study considered the effect of test pro- 
cedure on measures of vocabulary deficit 8 
and 10 years after bilateral topectomy. Three 
vocabulary tests, each employing the same 
words, were given to 21 operates and 19 
controls. The three tests were a multiple- 
choice test maximizing the necessity for sus- 
tained attention, an oral test, and a multiple- 
choice test minimizing the importance of 
attention. The results and conclusions follow: 

1. The topectomized schizophrenics showed
a consistent tendency to vocabulary deficit on 
all tests, either oral or multiple-choice. The 
loss was statistically significant in most cases, 
but slight: for example, a mean loss of some 
two and a half words in Wechsler Vocabulary. 

2. Vocabulary deficit was greater, and 
most significant, on the multiple-choice pro- 
cedure which minimized attention, and which 
was given after the other two tests. This pro- 
cedure differentiated the operates and con- 
trols at the .01 level of confidence. The mean 
superiority of the controls over the operates 
was nearly three times as many words as for 
the same words given orally. 

3. The results suggest that the test pro- 
cedure used may be an important factor in 
studies of vocabulary deficit. Relevant con- 
siderations include both the test format and 
the order of presentation. Both format and
order may contribute to the results of the
present study.


AN EVALUATION OF THE NORTHWESTERN INFANT
INTELLIGENCE TEST, TEST B 


BERNARD B. BRAEN 
Onondaga County Child Guidance Center 


The Northwestern Infant Intelligence Test, 
Test B (Gilliland, 1951) contains 40 items 
and is designed for use with infants between 
the ages of 13 and 36 weeks. Preliminary sta- 
tistical work with the test (Gilliland, 1951, 
p. 16) suggests that it may be a more suit-
able instrument for the assessment of infant
intelligence than other tests for this age pe- 
riod. This paper attempts to provide a sys- 
tematic assessment of reliability and validity 
of the test as well as other quantitative and 
qualitative considerations. 


METHOD 


The subjects consisted of 100 adoptive or boarding
home babies of both sexes between the ages of 13 and
36 weeks inclusive.1 In the first phase of the study
the Cattell Infant Intelligence Test (Cattell, 1947)
was administered to the baby by a qualified ex-
aminer.2 The second phase occurred 3 days later
when the Northwestern Infant Intelligence Test,
Test B was administered to the baby by a different
examiner.3 The third phase involved readminis-
tration of the Cattell by the original examiner 4 when
the baby was 18 months of age.5

A frequency distribution of age by week for the
babies administered the Northwestern is shown in
Table 1. The mean age was 22.56 weeks with a stand-
ard deviation of 6.71 weeks.

1 The Child and Family Service and the Depart-
ment of Public Welfare, Children’s Division, both of
Syracuse, New York were the participating agencies
in the study.

2 Appreciation is extended to . . . Roths-
child for . . . part in this p . . .

3 . . . all of the Northwestern tests.

4 After 45 retests the original examiner left the em-
ploy of the clinic. The author administered the re-
maining 19 tests.

5 The decision to retest the babies at 18 months
with the Cattell was determined by two factors: (a)
The correlation coefficient between . . .
months and the Stanford-Binet . . . This coefficient sug-
gests that . . . months appears to be
measuring the same factor or factors as the Stanford-
Binet at 3 years. (b) The results of this study were
to have definite practical implications regarding the
adoption testing program at the Child Guidance Cen-
ter. For this reason it was important to gather vali-
dating data with expediency but without sacrificing
rigor.


TABLE 1

FREQUENCY DISTRIBUTION OF AGE BY WEEKS FOR
NORTHWESTERN-TEST B SUBJECTS

(N = 100)

[Frequencies by age in weeks; table body not recoverable]


RESULTS


The mean IQ for the Northwestern was
91.42 with a standard deviation of 9.54. The
mean IQ for the Cattell was 112.00 with a
standard deviation of 15.21. On the basis of
these findings it appears that the unequal
means and standard deviations obtained with
these tests make any direct comparison of IQs
impossible. Further, without some sophistica-
tion with the concept of variability there may
be a tendency to misinterpret the meaning of
the IQ obtained with either test. In order to
make comparisons between the two sets of
scores, all the IQs on the Northwestern and
Cattell were converted to standard scores with
a mean of 100 and a standard deviation of 10.
Also the 100 babies were divided into four age
groups with approximately an equal number
of subjects in each age group.

The converted means and standard devia-
tions for the four age groups and the total
group appear in Table 2.


TABLE 2

MEANS AND STANDARD DEVIATIONS FOR NORTHWESTERN AND CATTELL STANDARD
SCORE IQs FOR FOUR AGE GROUPS AND TOTAL GROUP

                  Northwestern         Cattell
Age in                                                Mean
Weeks     N       M       SD        M       SD     Difference     t
13-16    24     99.50    6.34    103.17   12.19      -3.67      -1.69
17-21    25     97.60    9.52     93.96    9.45       3.64       2.18*
22-27    27    103.89    8.35    102.59    8.87       1.30        .83
28-35    24    100.38   11.94    104.88   10.43      -4.50      -1.78
13-36   100    100.42    9.54    100.00   10.19        .42        .67

* Significant at the .05 level.
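
The conversion to standard scores described above can be sketched as follows. This is an illustration only: the IQ values are hypothetical, the function name is invented for the example, and the use of the population standard deviation (rather than the sample form) is an assumption, since the text does not say which was used.

```python
from statistics import mean, pstdev


def to_standard_scores(iqs, new_mean=100.0, new_sd=10.0):
    """Rescale raw IQs linearly so the group has the requested mean and SD.

    Assumption: the population standard deviation (pstdev) is used.
    """
    m = mean(iqs)
    sd = pstdev(iqs)
    return [new_mean + new_sd * (x - m) / sd for x in iqs]


# Hypothetical raw IQs for the same five infants on the two tests.
northwestern_iqs = [82, 88, 91, 95, 101]
cattell_iqs = [90, 104, 112, 120, 134]

northwestern_std = to_standard_scores(northwestern_iqs)
cattell_std = to_standard_scores(cattell_iqs)

# Once both tests share a scale, each infant's relative position on the
# two tests can be compared by simple subtraction.
differences = [n - c for n, c in zip(northwestern_std, cattell_std)]
print([round(d, 2) for d in differences])
```

Because each test is rescaled against its own group mean and spread, a difference of zero means an infant held the same relative position on both tests, whatever the raw IQs were.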

From Table 2 it appears that variability of 
IQ from one test to the other occurs in all 
four age periods but the t tests reveal that it
is most pronounced at the 13-16, 17-21, and
28-35 week levels. The t for the 28-35 week
old group is artificially elevated due to the 
low ceiling on the Northwestern. By the IQ 
calculation method described by Gilliland 
(1951, p. 14) an infant of 36 weeks can only 
achieve an IQ of 112 on the Northwestern if 
he passes all 40 items. An infant of 35 weeks 
can only achieve an IQ of 115 if he passes all
40 items, and so on. Such a state of affairs 
could serve to lower artificially the mean IQ 
on the Northwestern for the 28-35 week old 
group, which would then contribute to a de- 
ceptive difference between the means of the 
Northwestern and Cattell. 

The means and standard deviations of the
difference between the Northwestern and Cat-
tell standard score IQs for the four age pe-
riods are reported in Table 3.

It appears from these data that the mean
difference in standard scores for each group
and total age period is no less than 6 and no
more than 10 units. However, the standard
deviations for each age period suggest that in-
dividual infants can vary from no difference
in relative position on the two tests to a dif-
ference of as much as 25 standard score units
from one test to the other. Since only 3 days
intervened between the Northwestern and Cat-
tell testings, these data indicate that at this
age level, examiner, subject, and test reliabil-
ity are difficult to achieve.


Reliability

The reliability of the Northwestern for the
100 babies 13-36 weeks of age was computed
through the odd-even method. The resulting
coefficient corrected by the Spearman-Brown
formula was .95 ± .01.

Table 4 shows the odd-even reliability co-
efficients for the four age periods.

Gilliland (1951, p. 16) reported a corrected
odd-even coefficient of .80 computed from

TABLE 3

MEANS AND STANDARD DEVIATIONS OF THE DIFFER-
ENCES BETWEEN NORTHWESTERN AND CATTELL
STANDARD SCORE IQs FOR FOUR AGE PERIODS

Age in
Weeks     N     Range      M       SD
13-16    24     1-25     8.88    6.70
17-21    25     0-22     7.00    [illegible]
22-27    27     0-22     6.26    [illegible]
28-35    24     0-22     9.38    6.44


TABLE 4

NORTHWESTERN ODD-EVEN RELIABILITY COEFFI-
CIENTS FOR FOUR AGE PERIODS

Age in
Weeks     N      r       SE
13-16    24     .64    ±.09
17-21    25     .76    ±.05
22-27    27     .88    ±.02
28-35    24     .89    ±.02

data obtained from 214 babies between 13 
and 36 weeks of age from the Chicago area. 
Cattell (1947, p. 49) reports odd-even coeffi- 
cients for her test for age periods comparable 
to the Northwestern age levels. At the very 
early age levels (13 weeks and 3 months) 
both the Northwestern and Cattell are less 
reliable than at the later levels, but in rela- 
tion to the Cattell at the early age period the 
Northwestern appears the more reliable in- 
strument. 
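
The odd-even method with the Spearman-Brown correction, as used for the coefficients above, can be sketched as follows. The pass/fail item data are hypothetical and the function names are invented for the example; only the correction formula itself, r_full = 2r / (1 + r), is standard.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to the full test length."""
    return 2 * r_half / (1 + r_half)


def odd_even_reliability(item_scores):
    """Correlate odd-item and even-item totals, then apply Spearman-Brown."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))


# Hypothetical pass (1) / fail (0) records: one row per infant, six items.
item_scores = [
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
]

print(round(odd_even_reliability(item_scores), 2))
```

A coefficient obtained this way describes consistency within a single testing session only, which is the limitation the text goes on to raise.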

In spite of the fact that the Northwestern 
permits a reliable representation of particular 
skills during the testing session, the obtained 
coefficients do not give information regarding 
consistency of performance over time. It would 
appear then that the reliability of the test for 
predictive purposes is limited.

The reliability coefficients for each of the
four age groups may in themselves be unre-
liable because of the small number of subjects
at each level. However, since the progression
of coefficients followed closely those obtained
with the Cattell it appeared that this factor
was not significantly affecting the resulting
coefficients.


Correlational Analysis 


The correlation coefficients between the
Northwestern, Test B and the Cattell for the
four age groups and the total group are re-
ported in Table 5.

These coefficients suggest that there is an
element of communality in the tests and that
this communal aspect is apparent for all ages.
The magnitude of the correlation coefficients
remains about the same for each age period
even though the reliability of both tests in-
creases with age. Such a finding suggests that
the tests have more in common at the 13-21
week period than the 22-36 week period.

The correlation coefficient of .58 between 
the Northwestern and Cattell for the whole 
age range may not reflect realistically the re- 
lationship between performance on the two 
tests because of the low ceiling on the North- 
western already described. In order to evalu- 
ate the effects of this artifact, the correlation 
was redone with the 28-35 week old group 
eliminated. The resulting coefficient was .62
± .05. It does not appear that the elimina-
tion of the oldest group markedly affects the
extent of the relation between performance 
on the two tests. 


Sex Differences 

A critical ratio was computed between the 
means for 61 males and 39 females on the 
Northwestern. The resulting ratio of 1.55 in- 
dicates that the obtained differences between 
the means of the sexes could be accounted for
by chance.


Validity

Validity was assessed by correlating per- 
formance of 64 subjects on both the Cattell 
and Northwestern at 13 to 36 weeks with 
their performance at 18 months of age on the 
Cattell. Table 6 shows the means and stand- 
ard deviations for the Northwestern and Cat- 
tell IQs at 13 to 36 weeks and the Cattell IQs 
at 18 months. 

The correlation coefficient between the
Northwestern IQs at 13 to 36 weeks and the
Cattell at 18 months was .38 ± .07. The co-
efficient between the Cattell at 13 to 36 weeks
and the Cattell at 18 months was .39 ± .07.

Both coefficients are low enough to indicate

TABLE 5

CORRELATION COEFFICIENTS BETWEEN NORTHWESTERN
AND CATTELL IQs FOR FOUR AGE GROUPS AND TOTAL
GROUP

Age in
Weeks     N      r       SE
13-16    24     .51    ±.10
17-21    25     .63    ±.08
22-27    27     .54    ±.09
28-35    24     .42    ±.11
13-36   100     .58    ±.04
TABLE 6

MEANS AND STANDARD DEVIATIONS FOR NORTH-
WESTERN AND CATTELL IQs AT 13-36 WEEKS AND
CATTELL IQs AT 18 MONTHS

Test                           N       M        SD
Northwestern (13-36 weeks)    64     90.91     9.38
Cattell (13-36 weeks)         64    111.92    16.91
Cattell (18 months)           64    110.58    10.65


that little faith can be placed in predictive 
statements regarding intelligence for the 13 
to 36 weeks age group. In fact only about 
14% of the variance on the Cattell at 18 
months can be accounted for by variation in 
the Northwestern and Cattell IQs at 13 to 36 
weeks. At the 13 to 36 week period, factors 
such as rapid and varying growth rates, em- 
phasis on sensorimotor skills, varying motiva- 
tion, relatively subjective administration and 
scoring procedures, and examiner unreliability 
all seem pertinent sources of uncontrolled 
variance that serve to reduce the predictive 
power of both the Northwestern and Cattell. 
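
The proportion of variance accounted for follows directly from squaring each validity coefficient. A minimal sketch, with an illustrative function name:

```python
def variance_accounted_for(r: float) -> float:
    """Proportion of criterion variance associated with a predictor: r squared."""
    return r * r


# Validity coefficients reported above (early score vs. Cattell at 18 months).
validity = {
    "Northwestern, 13-36 weeks": 0.38,
    "Cattell, 13-36 weeks": 0.39,
}

for label, r in validity.items():
    print(f"{label}: {variance_accounted_for(r):.1%} of variance")
```

Both coefficients therefore leave roughly 85% of the variance in 18-month Cattell IQs unaccounted for.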


SUMMARY 


This study was designed to investigate the 
reliability, validity, and certain other features 
of the Northwestern Infant Intelligence Test, 
Test B. One hundred adoptive or boarding
home babies between the ages of 13 and 36 
weeks were tested with the Cattell. Three days 
later the Northwestern was administered. The 
Cattell was readministered to 64 of the 100 
babies when they were 18 months of age. 




The general findings were: 


1. The IQs obtained on the Northwestern
and Cattell were not directly comparable due
to unequal means and standard deviations. 

2. The Northwestern has a low ceiling at 
the upper age levels which prevents full ex- 
pression of an infant’s developmental skills 
between the ages of 30 and 36 weeks. 

3. Odd-even reliability for the Northwest- 
ern and Cattell is similar for the 13 to 36 
week age period but the reliability for the 
Northwestern is somewhat better than the 
Cattell at the 13-16 week period. The reli- 
ability of both tests improved with increasing 
age. 

4. There appears to be a common factor in 
both tests at the 13 to 36 week period. This 
is especially true at the 13-21 week level— 
probably due to a preponderance of sensori- 
motor items on both tests for this age. 

5. There was no significant sex difference 
found on the Northwestern. 

6. The validity results indicate that predic- 
tion of a child’s intelligence level at 18 months 
is a very risky procedure when based on per- 
formance on the Northwestern and Cattell at 
the 13-36 week period. 


JUDGMENTS OF ADJUSTMENT FROM TAT STORIES
AS A FUNCTION OF EXPERIMENTALLY
ALTERED SETS


BERNARD LUBIN 1
Indiana University Medical Center


Several recent investigations have demon-
strated the susceptibility of the projective
techniques to situational influences (Masling,
1960). The arousal of temporary motivational
states (Mussen & Scodel, 1955), the relation-
ship between the subject and the examiner
(Bernstein, 1956), the subject's attitude to-
ward the test (Feldman & Graley, 1954), and
the manner in which the test is defined (Henry
& Rotter, 1956), produce identifiable changes
in projective test response. These studies usu-
ally end at the point of measuring changes in
the protocols; the implication seems to be that
the modified protocols would influence subse-
quent clinical judgment.

The clinical psychologist usually works in
a setting in which practical decisions having
great consequences for the individual and for
the community must be made. Often, the re-
ferrer requests an opinion as to an individual's
“adjustment.” Since the question rarely speci-
fies a criterion situation for the judgment and
since the various objective measures of ad-
justment are fairly specific (Tindall, 1955),
the judgment is frequently based on a global
estimate of intellectual and/or “emotional
controls.” Psychoanalytic formulations continue
to influence much of clinical practice; thus
emotional control often is taken to mean con-
trol of sexual and aggressive drives, and in-
ferences about psychological status are based
upon the degree and manner in which drive
derivatives enter consciousness and behavior.

The purpose of this investigation is to in-
quire into some of the determinants of global
judgments of adjustment. Specifically, TAT
stories which have been elicited
under different instructions are judged for
level of adjustment by clinical psychologists.
Stories elicited under “Facilitating” instruc-
tions (prestige suggestion which placed high
value on spontaneity and individuality) have
already been shown to contain a higher amount
of sexual and aggressive expression than sto-
ries elicited under “Neutral” or “Inhibiting”
(prestige suggestion which placed high value
on constraint and conformity) instructions
(Lubin, 1960). The expectation is, therefore,
that stories from the Facilitating condition
will be judged as lower in adjustment than
stories from either the Neutral or the In-
hibited condition.

1 I wish to thank James Norton for assistance with
problems of design and analysis.


METHOD 


In a previous investigation (Lubin, 1960), a ran- 
dom sample of 60 male college freshmen was tested 
in a 3 × 2 covariance design. The five conditions
were: two Card conditions (two TAT cards with 
high pull of sexual content and two TAT cards with 
high pull of aggressive content), and three levels of 
Instructional Set (Inhibiting, Neutral, and Facilitat- 
ing). Set was produced by means of prestige sug- 
gestion: the Facilitated group was told that “nor- 
mal,” “well-adjusted” people tend to let their im- 
agination go as it is stimulated by the cards; the 
Inhibited group was told that normal, well-adjusted 
people are the master of their imagination and emo- 
tions; and the Neutral group was given innocuous 
instructions.2 Subjects were randomly assigned to
conditions, 10 subjects to a condition. Analyses indi- 
cated that Set produced a significant effect on sexual 
and on aggressive expression, and that the interac- 
tion between Set and Cards produced a significant 
effect on aggressive expression. 

2 A more detailed description of the methodology
including the complete instructions can be found in
the report of the previous investigation (Lubin,
1960).


Stimuli


The two experimentally treated TAT stories pro- 
duced by each of the 60 subjects were stapled to- 
gether. The stories were verbatim transcriptions and 
contained a notation as to reaction time. Within each 
envelope, the 120 stories (60 pairs) were divided into 
two sets of 30 pairs each, 60 stories told to TAT 
Cards 2 and 10, and 60 stories told to Cards 8BM
and 20. The paired stories within each set were thor- 
oughly scrambled. 


Judges 


Ten clinical psychologists, seven of whom pos- 
sessed the PhD in clinical psychology and three who 
had a master’s degree and at least 2 years of addi- 
tional clinical experience, participated as judges. 


Procedure 


In the instructions, the judges were told that the 
stories were produced by 60 male subjects between 
the ages of 18 and 23. They were informed as to the 
TAT cards which had elicited the stories, and they 
were requested to rank the four major cues which 
they had used in rating the stories. Each judge 
received an envelope whose contents are described 
above. In addition, each envelope contained the fol- 
lowing seven-point scale together with detailed in- 
structions for its use.3 


1. Well-integrated, happy person, socially effective
2. Only mild problems in essentially well-functioning person
3. Particular problems of some difficulty but social effectiveness maintained
4. Discomfort from problems severe enough to require therapy but ability to carry on
5. Acute neurotic problems but reality contact tenuously preserved
6. Severe neurotic problems with disorganization
7. Severe disturbance bordering on psychotic or prepsychotic



The seven-point scale was adapted from one de-
veloped and used by Dymond (1954) in an investi-
gation of the effects of psychotherapy. Repeat scor-
ing reliability of this scale was found to be .94.

The two stories of each subject were assigned one 
rating by each of the 10 judges. Analysis of variance 
was conducted on the 600 ratings of adjustment. 


RESULTS AND DISCUSSION 


It is important to note that the validity or 
accuracy of the ratings was not a subject of 


3 A copy of the complete instructions has been de- 
posited with the American Documentation Institute, 
Order Document No. 6631 from ADI Auxiliary Pub- 
lications Project, Photoduplication Service, Library 
of Congress; Washington 25, D. C., remitting in ad- 
vance $1.25 for microfilm or $1.25 for photocopies. 
Make checks payable to: Chief, Photoduplication 
Service, Library of Congress. 
investigation. The stories had been produced
by a sample of 60 male freshmen chosen ran- 
domly from the total population of male 
freshmen of a large university. It can be as- 
sumed that these subjects represented a sam- 
ple of relatively well-functioning individuals. 
In order to preclude a skewed distribution of 
ratings, the judges were not told that the sub- 
jects were college students. However, neither 
were they told that the subjects represented 
a malfunctioning group, such as hospital or 
clinic patients. Under these conditions, one 
would expect that the effects of the stories 
themselves would be maximized. 

The scatter plot revealed that the 600 rat- 
ings made by the 10 judges were distributed 
in a normal fashion over the seven scale points. 

Table 1 presents the summary of the analy- 
sis of variance of the ratings of adjustment. 
It can be seen that the Instructional Set un- 
der which the TAT stories were elicited origi- 
nally produced a significant effect on the judg- 
ments of adjustment. Neither Cards nor the 
interaction between Set and Cards was found 
to produce a significant effect on judgments 
of adjustment. Those aspects of the judgments 
determined by the sexual or aggressive stimu- 
lus patterns of the cards seemed to produce 
an effect of equivalent magnitude. 

It seems reasonable to conclude, therefore, 
that the subject’s attitude toward the testing 
situation can influence not only the test proto- 
cols (Lubin, 1960) but also may produce an 
effect on resulting clinical judgment. 

Further information concerning the effect 
of Instructional Set upon judgments of ad- 
justment is presented in Table 2. Since the 
scale was constructed so that higher scores 
represent judgments of poorer adjustment, the 
order of the means indicates that highest level 
of adjustment was rated for stories from the 
Inhibiting condition, lowest for stories from 
the Facilitating condition, with stories from 
the Neutral condition occupying the intermediate
position. The Tukey Studentized
Range Test reveals that the difference between
the means of the Inhibiting and Facilitating
conditions is large enough to be accepted
with confidence, but that we cannot
safely say where the Neutral condition falls
in between the other two.
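
The pairwise comparison described above can be retraced from the published figures. The following is a minimal sketch, assuming the means reported in Table 2 and the error mean square reported in Table 1, with 20 subjects per Set. The critical value is an assumption: it is an approximate tabled value of the Studentized range at the .05 level for three means and about 54 error degrees of freedom, not a figure given in the article.

```python
import math

# Values transcribed from Tables 1 and 2 of the article.
means = {"Inhibiting": 35.35, "Neutral": 38.35, "Facilitating": 43.55}
ms_within = 75.49  # error mean square (Table 1)
n_per_set = 20     # subjects per Instructional Set

# ASSUMPTION: approximate tabled critical q at alpha = .05 for 3 means
# and about 54 error df (standard tables give 3.44 at 40 df, 3.40 at 60 df).
Q_CRIT_05 = 3.41


def q_statistic(mean_a, mean_b, ms_error, n):
    """Studentized range statistic for two means, each based on n scores."""
    return abs(mean_a - mean_b) / math.sqrt(ms_error / n)


pairs = [
    ("Inhibiting", "Facilitating"),
    ("Inhibiting", "Neutral"),
    ("Neutral", "Facilitating"),
]
results = {
    pair: q_statistic(means[pair[0]], means[pair[1]], ms_within, n_per_set)
    for pair in pairs
}

for pair, q in results.items():
    verdict = "significant" if q > Q_CRIT_05 else "not significant"
    print(pair, round(q, 2), verdict)
```

Under these assumptions only the Inhibiting versus Facilitating comparison exceeds the critical value, which agrees with the conclusion stated in the text.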
TABLE 1


ANALYSIS OF VARIANCE OF RATINGS OF ADJUSTMENT


Source                              df       SS        MS       F
Between Sets                         2     688.53    344.27   4.561*
Between Cards                        1       7.34      7.34    .097
Set X Cards                          2      30.41     15.21    .201
Subjects Within Set X Card Cells    54    4076.30     75.49
Totals                              59    4802.58

* Significant at the .05 level.
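
The entries of Table 1 can be checked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and each F is a mean square divided by the within-cells mean square. The sketch below assumes only the df and SS columns as printed; the variable names are illustrative.

```python
# Degrees of freedom and sums of squares transcribed from Table 1.
ERROR = "Subjects Within Set X Card Cells"
table = {
    "Between Sets": (2, 688.53),
    "Between Cards": (1, 7.34),
    "Set X Cards": (2, 30.41),
    ERROR: (54, 4076.30),
}

error_df, error_ss = table[ERROR]
ms_error = error_ss / error_df  # within-cells mean square

# F ratio for each effect: (SS / df) divided by the error mean square.
f_ratios = {
    source: (ss / df) / ms_error
    for source, (df, ss) in table.items()
    if source != ERROR
}

total_df = sum(df for df, _ in table.values())
total_ss = sum(ss for _, ss in table.values())

for source, f in f_ratios.items():
    print(f"{source}: F = {f:.3f}")
print("Totals:", total_df, round(total_ss, 2))
```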


Although the lists of story cues upon which 
the judges based their ratings were too varied 
to permit meaningful categorization and analy- 
sis, 8 of the 10 judges mentioned “amount of 
sex or aggression” as one of the four main 
cues which they used. The previous investiga- 
tion (Lubin, 1960) demonstrated that In- 
structional Set produced a significant effect 
on both sex and aggressive expression and 
further analysis of the effect indicated that 
the effect was linear, i.e., highest expression 
of sex and aggression in the Facilitating con- 
dition, lowest in the Inhibiting condition, and 
an intermediate position for the Neutral con- 
dition. Thus there is a strong suggestion that 
judgments of adjustment based on TAT stories
are significantly influenced by the amount
of sex and aggression which is expressed.

These findings support the observations of
Soskin (1954) that when the clinical psychologist
is requested to make interpretations
based on projective test protocols, his judgments
tend to be biased in the direction of
pathology. Also, Kenny and Bijou (1953)
found that when clinical psychologists were
asked to rank TAT stories according to their
interpretive significance, they tended to give
greater weight to contents which represented
expressions of the sexual and aggressive drives.

The findings of this investigation point to
some of the risks involved in “blind analysis.”
When the subject has a Set to be spontaneous
and to individualize himself, he expresses more
sex and aggression in his TAT stories (Lubin,
1960), and when the clinical psychologist,
without knowledge of the subject's Set, makes
a judgment of the subject’s adjustment based
on these same TAT stories, his judgment tends
to be biased in the direction of pathology to
the extent to which sex and aggression are
expressed in the stories.

It might be objected that in practice the 
clinical psychologist does not make judgments 
of adjustment based solely on the TAT and 
certainly not on such a small number of cards 
as were used in this study. The first part of 
the objection must be granted; such an important
judgment would be based upon a battery
of tests rather than a single instrument.
It should be noted, however, that within the
test battery, the TAT is likely to be used in
a variety of ways (Dana, 1956). In addition,
interpretation of the TAT in the clinical setting
is more often based on global judgment
than on the few time-consuming objective
methods of analysis which have been proposed.


SUMMARY 


In order to test the hypothesis that judg- 
ments of adjustment made by clinical psy- 
chologists are influenced by the Set under 
which a subject takes a projective test, two 
TAT stories from each of 60 subjects were 
rated on a seven-point scale of adjustment by 
10 clinical psychologists. The stories previously
had been elicited under differential
Instructional Sets: Inhibiting (N = 20),
Neutral (N = 20), and Facilitating (N = 20).


TABLE 2


MEAN RATINGS OF ADJUSTMENT
BY SET CONDITION


Set             Mean
Inhibiting      35.35
Neutral         38.35
Facilitating    43.55

It was found that Instructional Set significantly
influenced the ratings: most pathology
was rated for stories from the Facilitating 
condition and least for stories from the In- 
hibiting condition. Collateral data suggested 
that the amount of sexual and aggressive ex- 
pression in the stories was the most impor- 
tant factor which influenced the ratings. 
ORTHOPEDIC DISABILITY AS A FACTOR
IN HUMAN-FIGURE PERCEPTION 


AURELIA LEVI 


Teachers College, Columbia University 


There is perennial interest in the possible re- 
lationship between personality dynamics and 
various features of human-figure drawing.
Though experimental studies frequently fail 
to support such a relationship, the impression 
continues that it does nevertheless exist, and 
an occasional study showing positive results 
helps to keep the impression alive. 

Some of this ambiguity of result may be 
owing to the fact that even if it be true that 
a figure-drawing is a projection of its creator's 
body image (Machover, 1949), the projective
process is necessarily limited by perceptual
limitations, including perceptual prejudices
(Postman, Bruner, & McGinnies, 1948). Indeed,
this is the heart of the basic projective
hypothesis. It follows then that before we
can proceed to evaluate the peculiarities of a
drawing (its omissions, overemphases, distortions,
or whatever) in terms of its creator's
dynamics, we would do well to establish the
prejudicial influence of these dynamics on his
perceptions. In other words, even assuming
that a drawing is a projection, before we can
attribute a bit of elaborate overemphasis in
the drawing to its creator’s undue preoccupation
with the part in question, we have to insert
a middle step and show that his perception,
as yet uncomplicated by the act of
bodying forth mental images through paper
and pencil, has already been shaped by that
particular preoccupation. This point has been
made concisely by Silverstein and Robinson
(1956): “The assumption that the physical
body, the body image, and the drawn figure
are in isomorphic relation” remains as yet
unjustified. From their study they conclude that
“this one-to-one relationship does not seem to
exist” (p. 340).


It is only with the existence of prejudicial
influences on the perceptions of several diagnostic
groups that the present study is concerned.
The final step, relating these perceptions
to the recreative process of figure-drawing,
is beyond the scope of this study. Our
task will be to separate Silverstein and Robinson’s
three isomorphs into two groups of two,
and investigate a hypothesized one-to-one relation
between the first two, the physical body
and perceptions of the human figure (the body
image). Our diagnostic groups differ from
each other with respect to orthopedic disability,
a comparatively simple, objective,
distinguishing criterion.

This study hypothesizes a one-to-one rela- 
tion between the physical disability and per- 
ceptions of the human figure, and that a par- 
ticular orthopedic disability acts as a dislo- 
cating influence on perceptions of the relevant 
body part. Specifically, it is hypothesized that 
subjects with disability of the legs will be un- 
usually sensitive to the legs in drawings of 
the human figure; and that subjects with dis- 
ability of the arms will be unusually sensitive 
to the arms in drawings. 

Since the group of back disabilities is a 
less homogeneous group than the other two, 
and has less well-defined etiologies which have 
been thought to be of psychic origin, it is fur- 
ther hypothesized that this group will show 
a greater resemblance to the characteristics of 
a control group of nondisabled. 


PROCEDURE 
Subjects 


The experimental group was composed of 38 sub- 
jects with orthopedic disabilities as follows: (a) 12 
with traumatic disability (fractures, amputations) to 
the arms, and no other disability; (b) 13 with traumatic
disability to the legs, and no other disability;
(c) 13 with a variety of low-back disabilities (spinal 
TABLE 1
AGE, SEX, AND EDUCATION OF CONTROL
AND DISABLED SUBJECTS

                      Age            Sex       Education
Subjects     N    Mean   Range     M    F    Mean   Range
Control     35    40.8   19-68    21   14     9.0    2-20
Disabled    38    41.9   16-67    28   10     8.5    0-20


fusion; laminectomy; arthritis of the lower back; 
uncomplicated, nonradiating low-back pain of un- 
determined origin), and no other disability. The con- 
trol group was composed of 35 subjects without any 
orthopedic disability. As shown in Table 1, the age 
of the experimental group ranged between 16 and 67, 
mean age 41.9, and of the control group between 19 
and 68, mean age 40.8. The number of years of edu- 
cation for the experimental group ranged from 0 to 
20, mean 8.5 (median 9), and for the control group 
between 2 and 20, mean 9.0 (median 10). 


[Figure 1 not reproduced: model figure; standard units A, L, and B;
example figures scored A and B.]

Fig. 1. Model figure, standard body units, and five
examples of stick figures containing one standard
unit each.
Instrument


The instrument was a set of 27 cards, each bearing 
a pair of stick figures (after Sarbin, 1954; Sarbin & 
Hardyck, 1955), and a single large card bearing a 
model stick figure. All the stick figures are composed 
of three parts: an arm unit, a leg unit, and a head- 
and-spine unit. The three parts making up the model 
are considered to be the standard units of reference; 
each of the 54 paired figures contains only one stand- 
ard unit, the other two units being variants (Fig- 
ure 1). Each stick figure appears twice in the set, 
once on the left and once on the right. The figures 
are so paired that 18 of them contain the standard 
arm unit, 18 contain the standard leg unit, and 18 
contain the standard head-and-spine unit, but no 2 
figures on one card contain any identical unit. Thus, 
from an objective point of view, each figure bears 
exactly the same amount of resemblance to the 
model as every other figure. 


Method 


Each subject was instructed as follows: “Here is a 
model figure, on the large card. I am going to show 
you other figures in pairs, and I want you to pick 
out which one of each pair looks more like the model, 
or resembles the model more closely.” Scores for each 
subject are the number of choices he made involving 
each standard unit, the maximum number of choices 
possible for any unit being 18, and the minimum be- 
ing 0. For example, a figure choice is credited to the 
Arm score if the standard unit it contains is the 
standard arm unit; similarly, it is credited to the 
Back score, if the standard unit it contains is the 
standard head-and-spine unit; etc. (Figure 1). 
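
The scoring rule just described amounts to a tally over the 27 cards. The sketch below is illustrative only: the letters A, L, and B follow the unit labels of Figure 1, while the function name, the input format (one letter per card, naming the standard unit contained in the chosen figure), and the example subject are assumptions made for the illustration.

```python
from collections import Counter

UNITS = ("A", "L", "B")  # standard arm, leg, and head-and-spine units
N_CARDS = 27             # one choice per card
MAX_PER_UNIT = 18        # each standard unit appears on 18 of the 27 cards


def score_choices(chosen_units):
    """Count how many chosen figures contained each standard unit.

    chosen_units holds one letter per card: the standard unit contained
    in the figure the subject judged to look more like the model.
    """
    if len(chosen_units) != N_CARDS:
        raise ValueError("expected one choice per card")
    counts = Counter(chosen_units)
    if set(counts) - set(UNITS):
        raise ValueError("unknown unit label")
    scores = {unit: counts.get(unit, 0) for unit in UNITS}
    if any(value > MAX_PER_UNIT for value in scores.values()):
        raise ValueError("a unit cannot be chosen more than 18 times")
    return scores


# Hypothetical subject who favors figures containing the standard arm unit.
example = ["A"] * 14 + ["L"] * 6 + ["B"] * 7
scores = score_choices(example)
print(scores)
```

Because every card yields exactly one choice, the three scores of a subject always sum to 27, so the scores are not independent of one another.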


TABLE 2


COMPARISON OF MEANS OF THREE DISABILITY
GROUPS WITH A CONTROL GROUP


Group         N     Mean     t*      p
A Choices
  Group A    12    14.17    3.49    .01
  Control    35     8.94
L Choices
  Group L    13    11.23    5.05    .001
  Control    35     7.49
B Choices
  Group B    13    12.92    1.48    .20
  Control    35    10.54

* The t tests comparing Groups A and B with the control
group were computed by a method designed for groups whose
variances are unknown but presumed equal (see Table 3). The
t test comparing Group L with the control group was computed
for the case for which variances are unknown but presumed
unequal (Walker & Lev, 1953).
TABLE 3


COMPARISON OF VARIANCES OF THREE DISABILITY
GROUPS WITH A CONTROL GROUP AND
WITH EACH OTHER


Group         N    Variance     F       p
A Choices
  Group A    12     20.7      1.02     ns
  Control    35     20.29
L Choices
  Group L    13      3.19     3.35    .05
  Control    35     10.67
B Choices
  Group B    13     18.41     1.47     ns
  Control    35     27.02

  Group A    12     20.7      6.49    .01
  Group L    13      3.19
  Group B    13     18.41     5.77    .01
  Group L    13      3.19
  Group A    12     20.7      1.12     ns
  Group B    13     18.41
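
The t values of Table 2 can be approximately recovered from the means in Table 2 and the variances in Table 3. The sketch below rests on two assumptions not stated in the article: that the tabled variances are unbiased (n - 1) estimates, and that the unequal-variance method cited from Walker and Lev corresponds to the familiar separate-variance form of the statistic. Small discrepancies from the printed values are to be expected from rounding.

```python
import math


def pooled_t(m1, v1, n1, m2, v2, n2):
    """t for two independent means, variances presumed equal (pooled)."""
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def separate_variance_t(m1, v1, n1, m2, v2, n2):
    """t for two independent means, variances presumed unequal."""
    return (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)


# (mean, variance, N) for each disability group and its control comparison,
# transcribed from Tables 2 and 3.
t_arm = pooled_t(14.17, 20.7, 12, 8.94, 20.29, 35)
t_leg = separate_variance_t(11.23, 3.19, 13, 7.49, 10.67, 35)
t_back = pooled_t(12.92, 18.41, 13, 10.54, 27.02, 35)

# Variance ratio for the leg comparison (larger variance over smaller).
f_leg = 10.67 / 3.19

print(round(t_arm, 2), round(t_leg, 2), round(t_back, 2), round(f_leg, 2))
```

The significant variance ratio for the L choices is what motivates the separate-variance form for that comparison alone.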
RESULTS 


As was hypothesized, the arm-disabled and
leg-disabled groups show a perceptual sensitiveness
that varies with the site of disability.
For these two groups, means differ from those
of nondisabled persons (Table 2), at a high
degree of significance (.01 and .001, respectively).

As was hypothesized, the group of back disabilities
shows no such perceptual difference
from the nondisabled. The mean B Score for
the back-disabled group does not differ significantly
from the control mean (Table 2).

An unexpected finding is that the leg-disabled
group is far more homogeneous than
any of the others (Table 3).


DISCUSSION


The main finding of the study, that there
is a one-to-one relationship between certain
types of physical disability and perceptions
of the human figure, provides a necessary,
though not sufficient, step for a demonstration
that a figure-drawing is a projection of
its creator’s body image.


That this one-to-one relationship is not a 
simple, across-the-board condition whose ex- 
istence can be assumed for every kind of 
physical idiosyncrasy may be seen from the 
fact that the group of back disabilities, whose 
origins and symptomatology are much less 
clearcut than those of the other two groups, 
shows a less distinct perceptual prejudice; 
and from the differing variability of the 
groups. 

It may be that the degree to which the disability
is visually apparent immediately exerts
an important influence. Under such a
scheme, low-back sufferers would be at the 
low end of a continuum of immediately ap- 
parent crippling; arm fractures and even am- 
putations of the hand or arm—especially when 
fitted with a cosmetic prosthesis—while be- 
coming apparent enough in any situation in- 
volving interaction, would nevertheless be 
more readily concealable than the impaired 
and distorted ambulation of the leg-disability 
group. Such an explanation might account for 
the findings with respect to the variability of 
the groups as well as their perceptual preju- 
dice. 

It should be noted that the results of this 
study may possibly be dependent on the fact 
that the control group and all three experi- 
mental groups were fairly homogeneous with 
respect to age, education, and socioeconomic 
level. To the extent to which differences in
these factors might exert competing influences
(as, for instance, the possibility that an extremely
gifted, high n Achievement subject
might bring into play an opposing perceptual
prejudice akin to repression; Postman,
Bruner, & McGinnies, 1948), the pattern of
results might be expected to vary; such an
effect would, of course, be a matter for clarification
through further research.


SUMMARY 


This study compared three physical-disabil- 
ity groups with a group of nondisabled con- 
trols for their perceptual reactions to a struc- 
tured test involving resemblances among sche- 
matized human-figure drawings. 

It was ascertained that subjects with arm
disability are particularly sensitive to the
arms in a drawing, and that subjects with leg
disability are particularly sensitive to the legs
in a drawing. Subjects with a variety of low- 
back ailments—often thought to be psycho- 
genic—appear to be closer to the nondisabled 
control in their reactions than do either of the 
other two groups. 

The group of leg disabilities is a much more 
homogeneous group than either the other two 
disabled groups or the controls. 

A possible explanation that would account 
for all these findings is offered on the basis of 
the degree to which the disability is instantly 
apparent. 

It is considered that this study gives sup- 
port to a hypothesized one-to-one relationship 
between the physical body and the body 
image. 


TAT PERFORMANCE AS A FUNCTION OF ANXIETY
AND COPING-AVOIDING BEHAVIOR * 


E. JERRY PHARES 


Kansas State University 


A characteristic often associated with anxiety
is its potential for generalization. Based
perhaps on either qualitative similarity or
identical elements, anxiety may become attached
not only to the original arousal situation
but also to other situations.

One such situation is that of projective testing.
Anxious patients frequently seem to project
threat into the test stimuli. For example,
Rotter (1940, 1946) has stated that anxiety
on the TAT is revealed in plots that emphasize
sudden physical accidents and emotional
trauma. Similarly, Schwartz (1955), from a
Freudian viewpoint, has related the expression
of castration anxiety on the TAT to
themes in which occur genital or other body
injury or loss, sexual or personal inadequacy,
intrapsychic or extrapsychic threat, and loss
of cathected objects.

However, to expect every anxious patient
to produce such themes would seem a too simple
approach to the problem. Not all anxious
people handle their anxiety alike. Some subjects,
having perceived a threatening stimulus,
become evasive and produce bland stories,
while others respond directly and create
themes indicative of threat. The distinction
drawn here parallels that of the perceptual
defense-sensitization dimension (Carpenter,
Wiener, & Carpenter, 1956).

In a similar vein, Weisskopf-Joelson, Asher,
and Albrecht (1957) investigated “label
avoidance” as a manifestation of repression.
They found some support for the hypothesis
that people who strongly repress an impulse
tend to avoid expressing this impulse in situations
carrying its label to a high degree. Applied
to the TAT this could mean that subjects
with strongly repressed sexual or aggressive
impulses would frequently fail to give
sexual or aggressive themes to pictures sug- 
gestive of an aggressive or sexual content. 

The hypothesis of the present study is that 
high anxious subjects will show greater pref- 
erence for TAT themes involving accident, 
threat, or trauma than will low anxious sub- 
jects when matched for tendency to evade or 
cope with threatening stimuli. The present 
study resembles one by Goodstein (1954) who 
found a nonsignificant relationship between 
anxiety and preference for anxiety-like TAT 
statements without controlling for coping- 
avoiding tendencies. 


METHOD 


Selection of High and Low Anxious Subjects 


The Taylor anxiety scale (A scale) (1953) was 
used to select high and low anxious subjects. From 
263 general psychology students who took a 50 item 
form of the A scale during pre-enrollment, 25 high 
anxious (scores above 18) and 25 low anxious fe- 
males (scores under 7) were selected. 


Determination of Coping-Avoiding Tendencies 


This technique stems from a distinction between 
copers and avoiders recently made by Mainord 
(1956). He demonstrated that copers recalled more 
nonsense syllables associated with disturbing words 
than with neutral words, and avoiders recalled more 
syllables associated with neutral words. Goldstein 
(1959) utilized the same distinction in predicting dif- 
ferential responses to fear-arousing propaganda. Both 
of these investigations used an incomplete sentences 
blank (ISB) consisting of 40 critical and 20 filler 
stems. The former have direct sexual and aggressive 
implications. Critical items are scored on a three- 
point scale in terms of specificity, strength, and arbi- 
trariness of response. A subject’s score is the sum of 
the weights assigned to each item and a high score
indicates coping tendencies. Copers are thus subjects
who respond most directly to the implications of the
stems while avoiders are those who respond most
evasively. Using this technique, two judges independently
scored 13 protocols from a pretest population
with 89% scoring agreement.


TAT Measure 


To increase objectivity, the usual method of ad- 
ministering and scoring the TAT was modified. Seven 
cards (4, 6BM, 7BM, 13MF, 14, 17BM, and 20) 
were presented, each accompanied by six themes: 
four neutral and two embodying threat, accident, or 
trauma. For example, with 6BM: 

neutral: This young man has come to the mansion
to apply for the job of gardener. He has been
asked to wait by this elderly maid. So, hat in hand,
he waits.

threat: The son h[...] that all of their savings
have be[...] are both stunned by [...] left.

[...] the higher the score, the lower the preference
ranking for threat themes.


Each subject was tested individually. The modified
TAT was administered, followed by the coping-avoiding
ISB. Subjects responded anonymously [...] TAT and
ISB [...].


RESULTS AND DISCUSSION


From the foregoing procedure it was possible
to match high and low anxiety subjects
with respect to ISB scores. The N of each
group was 19 and the mean ISB score for each
group was 35.0.

The mean TAT score for the high anxious
group was 47.2 (SD 4.7), and for the low
anxious group 53.6 (SD 6.5). Applying a t
test,2 the difference between the means is significant
and in the predicted direction (t = 3.4;
p = .001).

The results bear out the hypothesis that
anxious people see more threat in TAT cards
than do nonanxious people when their technique
of defending against anxiety is controlled.
Goodstein (1954) reported a nonsignificant
trend in a similar direction due
perhaps to a lack of control for the coping-avoiding
dimension. Although a portion of his
data was based on 10 anxious and 10 nonanxious
subjects while the present data is
based on 19 pairs of subjects, the discrepancy
between probability estimates in the two studies
is much greater than would be expected
merely by augmenting N.


2 A one-tailed test of the distribution of t was
used.

Thus, it seems probable that in a given unselected
population of anxious subjects, some
who are copers and others avoiders, the confirmation
of an otherwise perfectly logical and
tenable hypothesis might be prevented. For
example, Lesser (1959) demonstrated that
under conditions of high anxiety about aggression,
the intercorrelations among various
measures of aggression were significantly lower
than in the case of low anxiety over aggression.
A behavior occurs not as the function
of one variable but as a function of some
relationship among several variables.
Generalization of the present results is, of
course, limited by the sex [...] of the
population and by the modified TAT procedure.


SUMMARY


This study hypothesized that anxious subjects
will show greater preference for TAT
themes involving accident, threat, or trauma
than will nonanxious subjects when matched
for tendency to evade or cope with threatening
stimuli.

Twenty-five subjects high on the Taylor
anxiety scale and 25 low were
administered a modified TAT and an
ISB which measures coping-avoiding tendencies.
Subjects rank ordered neutral and threatening
themes which accompanied seven TAT
cards in terms of how well they fitted the
cards.

With this procedure, 19 pairs of high and
low anxiety female subjects matched for coping
scores confirmed the hypothesis at a statistically
significant level.
AND INTERACTION BEHAVIOR IN INTERVIEWS 1
Purdue University


This study was undertaken in order to ex- 
plore possible relationships between two dif- 
ferent general approaches to the description 
and measurement of verbal interview behav- 
ior. One widely applied frame of reference di- 
rects attention to the communication aspects 
of verbal behavior, that is, to some symbolic 
dimension of the content of the words spoken, 
using content analysis to define and quantify 
its variables. A second frame of reference 
focuses upon quantitative temporal charac- 
teristics of interview interactions, utilizing 
measures such as number and duration of ut- 
terances, duration of silences, etc. 

The few studies which have incorporated 
measures of both content and temporally de- 
fined variables have usually indicated the 
greater discriminatory power of the latter. 
For example, Page (1953), Lennard (1955), 
Goldman-Eisler (1952), and others have 
found that quantity or tempo of verbal output
is both more stable and more highly correlated
with various criteria of personal adjustment
or psychotherapeutic success than
are content-derived variables.

Lundy (1950) found that a single patient 
seen concurrently by two therapists differing 
in therapeutic technique showed no difference 


1 This investigation was supported by a research 
grant (M-735) from the National Institute of Men- 
tal Health, of the National Institutes of Health, 
United States Public Health Service. This paper is 
based in part on a dissertation, under the direction 
of Frederick H. Kanfer, submitted by the senior au- 
thor in partial fulfillment of the requirements for the 
degree of doctor of philosophy at Washington Uni- 
versity. The data were collected while all of the in- 
vestigators were at Washington University. 


in his responses to the two therapists in meas- 
ures of content (Distress-Relief Quotient, 
Raimy’s self-references, etc.). However, dif- 
ferences in tempo of interaction in the two 
sets of therapeutic interviews were apparent 
in the protocols. As a secondary aspect of a 
subsequent study, Lundy (1955) compared 
the temporal variables with clinical estimates 
of occurrence of significant emotional and 
topical content in “key” interviews. The results
indicated that the contentual and temporal
variables were related.

The present study involves the direct in- 
vestigation of relationships between more pre- 
cisely measured content variables and the 
temporal measures of the Interaction Chrono- 
graph. This instrument and Chapple’s un- 
derlying interaction theory (Chapple, 1949; 
Chapple & Arensberg, 1940) constitute an ex- 
tensive, systematically developed attempt to 
describe temporal phenomena in verbal be- 
havior and to investigate their significance. 
Definitions of the Interaction Chronograph 
variables involved in the present study are 
given in Table 1. Without more extensive 
knowledge on our part of the subtle dif- 
ferences in “meaning” of the several inter- 
correlated Interaction Chronograph variables 
(Matarazzo, Saslow, & Hare, 1958), precise
hypotheses as to specifically which content
and interactional variables would be related
could not be formulated. However, tentative
hypotheses regarding the general types of relationships
to be expected on a “face validity”
basis were obviously the determinants of the
kinds of content categories selected for use,
and are illustrated below in the description of
the content system.


TABLE 1


INTERACTION CHRONOGRAPH VARIABLES


Pt.'s Units         The number of times the patient acted

Pt.'s Tempo         The average duration of each action
                    plus its following inaction (silence),
                    as a single measure

Pt.'s Silence       The average duration of the patient’s
                    silences

Pt.'s Adjustment    The durations of the patient’s interruptions
                    minus the durations of his latencies in
                    responding, divided by Pt.’s Units

Pt.'s Initiative    The percentage of times, out of the
                    available number of opportunities
                    (usually 12) in Period 2, in which
                    the patient acted again (within a
                    15-second limit) following his own
                    last action

Pt.'s Quickness     The average length of time in Period
                    2 that the patient waited before
                    taking the initiative following his
                    own last action

Pt.'s Dominance     The number of times in Period 4
                    that the patient “talked down” the
                    interviewer minus the number of
                    times the interviewer talked down
                    the patient, divided by the number
                    of Pt.’s Units in that period
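
Three of the simpler variables defined in Table 1 (Units, Tempo, and Silence) can be illustrated with a short computation. This is only a sketch: the input format, a list of (action seconds, following silence seconds) pairs for the patient, the function name, and the sample figures are all hypothetical, and the Interaction Chronograph itself derived these measures mechanically from the observer's record.

```python
from statistics import mean


def chronograph_summary(exchanges):
    """Summarize a patient's record of (action, following silence) durations.

    ASSUMPTION: each tuple gives the seconds the patient acted and the
    seconds of inaction that followed; this format is illustrative only.
    """
    units = len(exchanges)  # number of times the patient acted
    tempo = mean([action + silence for action, silence in exchanges])
    silence = mean([silence for _, silence in exchanges])
    return {"units": units, "tempo": tempo, "silence": silence}


# Hypothetical record of three patient actions.
example = [(10.0, 2.0), (20.0, 4.0), (30.0, 6.0)]
summary = chronograph_summary(example)
print(summary)
```

The remaining variables (Adjustment, Initiative, Quickness, Dominance) depend on interruptions, latencies, and the period structure of the standardized interview, and are not covered by this simplified record.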


PROCEDURE 
Forty patients randomly selected from new referrals
to the Psychiatric Outpatient Clinic of a large
[...] medical center were interviewed by [...]
psychiatrist according to the published [...]
partially standardized interview which is used with
the Interaction Chronograph in order that the interviewer
may serve as a partially controlled independent
variable during each of [...] defined
subperiods of the interview (Matarazzo, [...]
Matarazzo, 1956; Saslow & Matarazzo, 1957). Each
interview was observed from an [...] room
through a one-way mirror by [...]
who recorded the ongoing interaction on an Interaction
Chronograph and made a verbatim sound recording
on a Gray Audograph. [...] 40 patients
[...] because of deficiencies in the [...]
interview which resulted in [...]
content. The 30 remaining patients served as
the subjects of the present study. Of these, 17 were
female, 13 were male. Their ages ranged from [...] to
61 years, with a median age of 35.5. The most frequent
diagnoses were: hysteria (eight cases), anxiety
neurosis (seven cases), depression (six cases) and 
schizoid personality (four cases). 

The sound recordings were carefully and repeatedly 
monitored by the typist and a judge, until a high de- 
gree of recording transcription fidelity was achieved. 
Since the reliability of the content scoring process 
had been demonstrated on an independent sample of 
transcripts shortly before (Phillips, 1957), the final 
transcripts were unitized and then categorized, unit 
by unit, by one experimenter. 

The content aspects of the verbal interview behavior
were defined and quantified according to the category
system schematically diagrammed in Table 2.
The system and its development have been described 
in Phillips (1957). It represents an adaptation and 
extension of the Interpersonal System devised by 
Freedman, Leary, Ossorio, and Coffey (1951), C and 
C' in Table 2, which consists of a circular continuum 
of 16 categories, representing qualitative blendings of 
two orthogonal dimensions, love-hate and domi- 
nance-submission. A seventeenth category was added 
for coding of units unscorable or neutral within the 
Interpersonal System. 

In addition to the Interpersonal System, several 
other dimensions were coded in order to achieve com- 
pleteness of coverage and more general applicability. 
These other dimensions were operationally defined 
by: (a) coding of units without interpersonal refer- 
ence, D and D' in Table 2, such as “I sat down and 
ate,” “My head aches,” etc.; (b) coding of the actor 
or subject of a description and of the receiver of ac- 
tion or attitude, if any, A and E in Table 2; and (c) 
coding of the general topic discussed, e.g., marriage, 
finances, symptoms, etc. Provision was also made for 
differentiation of descriptions of motor acts, C and D, 
from description of nonmotor states of being, thinking,
feeling, etc., C' and D’. For example, “I yelled
at her” is an interpersonal act, C, while “I was angry 
at her” is coded as an interpersonal state, C’; “I ate 
dinner” is a noninterpersonal act, D, while “I felt 
sick” is a noninterpersonal state, D’. Those units 
which referred directly to the ongoing interview in- 
teraction and had no referent outside of the cur- 
rent situation (e.g. “Thank you, doctor,” “I agree 
with you”) were classified separately, according to 
their function, within categories adapted from Bales 
(1950), II in Table 2. A residual category was uti- 
lized for unscorable units, III. 
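
The distinctions drawn above can be sketched as a small data structure. The following is only an illustration under stated assumptions: the class `CodedUnit`, the helpers `classify_kind` and `percent_of_units`, and the choice of all units as the percentage base are hypothetical and are not taken from Phillips (1957). The example sentences are the ones quoted in the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class CodedUnit:
    """One content unit after coding (field names are illustrative only)."""
    text: str
    section: str                 # "I" description, "II" direct interaction, "III" unscorable
    actor: Optional[str] = None  # category A, e.g. "self" or "family"
    kind: Optional[str] = None   # "C", "C'", "D", or "D'"


def classify_kind(interpersonal: bool, overt_act: bool) -> str:
    """Pick C, C', D, or D' from the two distinctions drawn in the text."""
    if interpersonal:
        return "C" if overt_act else "C'"
    return "D" if overt_act else "D'"


def percent_of_units(units: List[CodedUnit],
                     keep: Callable[[CodedUnit], bool]) -> float:
    """Percentage of all units for which `keep` returns True (assumed base)."""
    return 100.0 * sum(1 for u in units if keep(u)) / len(units)


units = [
    CodedUnit("I yelled at her", "I", "self", classify_kind(True, True)),
    CodedUnit("I was angry at her", "I", "self", classify_kind(True, False)),
    CodedUnit("I ate dinner", "I", "self", classify_kind(False, True)),
    CodedUnit("I felt sick", "I", "self", classify_kind(False, False)),
    CodedUnit("Thank you, doctor", "II"),
]

# Composite "Self C + D": the self described as physically active.
self_acts = percent_of_units(
    units, lambda u: u.actor == "self" and u.kind in ("C", "D")
)
```

Composite scores such as Self C + C' would be obtained the same way by changing the predicate passed to `percent_of_units`.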

The resulting system is of a pyramidal nature, in- 
volving several series of mutually exclusive cate- 
gories. It was considered particularly suited to the 
purposes of this study because of its interpersonal 
interactional emphasis and its use of content vari- 
ables seemingly parallel to the temporal variables of 
the Interaction Chronograph. For example, it was 
expected that one or more of the Interaction Chrono- 
graph measures of verbal “output” (number and 
duration of utterances, etc.) would be related to the 
frequency of content describing the self as physically 
active, Self C+D in Table 2. Since the former 
measures of verbal output in the interview have been 
TABLE 2

CONTENT CATEGORY SYSTEM

I. Description (units which “tell about” an event)
   A. Actor (subject [person] of unit)
      1. Patient himself
      2. Patient’s family
      3. Patient’s spouse, date, fiancee
      etc.
   B. Time (of occurrence of described event)
      1. Present
      2. Past
      3. Future
      4. Subjunctive
   C. Interpersonal Acts (overt motor behavior involving two or more
      people, e.g., telling, hitting, leaving someone)
      OR
   C’. Interpersonal States (nonovert states, thoughts, attitudes, etc.,
      involving two or more people; e.g., thinking of, being angry at,
      or afraid of someone)
      1-16. Interpersonal System Categories 1 to 16
      17. Neutral, unscorable
      OR
   D. Noninterpersonal Acts (overt motor behavior involving only one
      person, e.g., eating, bathing, walking, etc.)
      OR
   D’. Noninterpersonal States (nonovert states, thoughts, attitudes,
      etc., involving only one person, e.g., feeling ill, being sleepy,
      poor, happy, etc.)
      1. Positive (welcome, pleasant, etc. to patient)
      2. Neutral (neutral or indeterminant in significance to patient)
      3. Negative (disliked, unpleasant, etc. to patient)
   E. Object (person acted on or with; in interpersonal units)
      (As for A above)
   F. Topical Area (of general life experience)
      1. Financial
      2. Marital-Sexual
      3. Religious-Philosophical
      4. Educational
      etc.

II. Direct Interaction (units dealing directly with current interview
    interaction)
   A. Agrees, expresses compliance with interviewer
   B. Disagrees, expresses noncompliance with interviewer
   C. Asks for information, repetition, clarification, etc., from
      interviewer
   etc.

III. Unscorable

Phillips, Matarazzo, Matarazzo, Saslow, and Kanfer 


described by Chapple² as involving both a physical
high energy aspect and an out-going, interaction
seeking aspect, it was further expected that one
or more of them would be related to the degree 
of emphasis which the patient’s content placed on 
interacting with others, Self C + C’ in Table 2. Since 
the degree to which a patient takes the initiative in 
speaking, when the interviewer deliberately remains 
silent, has been proposed by Chapple as a measure of 
both the “drive” aspect of behavior and of the in- 
dependence and scope of a person’s interpersonal re- 
lationships, it was tentatively expected that Patient’s 
(Pt.’s) Initiative would covary with content meas- 
ures of breadth of interests (number of different per- 
sons and topics mentioned) and with description of 
the self in dominant interpersonal roles. It was such 
expectations as these, then, which guided the choice 
of dimensions of content to be included with the 
overly narrow Leary System in satisfying our goal 
of a comprehensive and multilevel content system. 

The specific content scoring procedures include the 
stipulations that the categories are to be applied by 
the judges with a minimum degree of inference, and 
from the point of view of the patient. That is, all 
coding under this system is performed according to 
the relationship to or impact on the patient of the 
events as he describes them, without consideration of 
possible interpretations of psychological defenses or 
mechanisms, etc. 

The content unit utilized for dividing verbalizations 
into countable and codable segments was basically 
defined as the minimal verbal statement which con- 
sensus of raters indicates to be understood as ex- 
pressing an independent communication or thought. 
Although it was developed and tested independently,
this unit is very similar to that of Auld and White 
(1956) and of Murray (1956). 

Both the content unitizing and categorizing processes
have been shown to have adequate reliability
when applied independently by trained judges according
to detailed definitions and rules (Phillips,
1957).

Raw frequency scores (number of content units 
coded for each category) were converted into per- 
centages so that intersubject comparisons would be 
unaffected by differences in absolute numbers of 
unitized items. The percentage scores were trans- 
formed by the arc-sine transformation (Snedecor, 
1946) for purposes of statistical analysis. The ma- 
jor content scores were then correlated (Pearson r) 
with 12 temporal Interaction Chronograph variables. 
Means and other statistics, when obtained, were con- 
verted back to percentages. 
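
The scoring steps just described (raw counts to percentages, arc-sine transformation, Pearson r, conversion back to percentages) can be sketched as follows. This is a minimal illustration, not the authors' computation: the function names are hypothetical, and expressing the angle in degrees is an assumption about the tabled form of the transformation.

```python
import math


def percentages(counts):
    """Convert raw category counts into percentages of the subject's total units."""
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}


def arcsine_transform(pct):
    """Arc-sine (angular) transformation of a percentage; angle in degrees (assumed)."""
    return math.degrees(math.asin(math.sqrt(pct / 100.0)))


def inverse_arcsine(angle_deg):
    """Convert a transformed score (degrees) back to a percentage."""
    return 100.0 * math.sin(math.radians(angle_deg)) ** 2


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

In use, each subject's percentage for a content category would be passed through `arcsine_transform`, and the resulting list correlated with the same subjects' scores on one Interaction Chronograph variable via `pearson_r`.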


RESULTS 


Table 3 presents the major findings. In order to conserve space, only
those relationships which reached statistical significance, and those
approaching significance (given in parentheses), are included. Table 3
also does not include content categories which are essentially
reciprocals of those presented, nor some Interaction Chronograph
variables which have been shown to be highly correlated with those
included and whose correlations with content categories essentially
duplicated these results.³

2 E. D. Chapple. Manual for the Interaction Chronograph, personal
communication, undated.

3 The complete matrices for content vs. Interaction Chronograph
correlations have been deposited with the American Documentation
Institute. Order Document No. 6641 from ADI Auxiliary Publications
Project, Photoduplication Service, Library of Congress; Washington 25,
D. C., remitting in advance $1.25 for microfilm or ... for photocopies.
Make checks payable to: Chief, Photoduplication Service, Library of
Congress.


Descriptive Content and Interaction Behavior


TABLE 3

CORRELATIONS BETWEEN CONTENT CATEGORIES AND INTERACTION CHRONOGRAPH
VARIABLES

Interaction Chronograph variables (columns): Pt.’s Units, Pt.’s Tempo,
Pt.’s Silence, Pt.’s Adj., Pt.’s Init., Pt.’s Quick., Pt.’s Domin.

Content categories (rows):
Description (I)
Self Subject (A)
Self Acts (D + C)
Self Interpersonal (C + C’)
Self Dominant-Hostile Quad.
Self Submissive-Hostile Quad.
Self Dominant-Positive Quad.
Self Submissive
Self Positive
Self Noninterpersonal (D + D’)
Self Noninterpersonal Negative (D + D’ - 3)
Other Interpersonal (C + C’)
Other Noninterpersonal (D + D’)
Interpersonal (C + C’)
Direct Interaction (II)
Number Topics (F)
Number Persons (A + E)

[Table body: correlation values not legible in this copy.]

Note. r = .36 for p = .05. r = .46 for p = .01. Parenthetical values
approach significance at .05 level.


Pt.’s Units. Because of the relatively fixed length of the interview
and of the interviewer’s utterances, Pt.’s Units is highly correlated
in a negative direction with measures (Pt.’s Action, Pt.’s Tempo, and
Pt.’s Activity) of the duration of the patient’s utterances. In the
present sample it correlates -.70 with Pt.’s Action and may be assumed
to be one of the more stable and representative measures of the general
level of the patient’s output because, as a frequency measure, it is
less affected by a few extremely short or long utterances than are the
duration measures.⁴ The results shown in the first column of
Table 3 reveal that the patient who has fewer 
Pt.’s Units, that is, who speaks relatively in- 
frequently (but with longer durations per ut- 
terance), is shown by these correlations with 
content measures to describe himself as rela- 
tively: more active in his daily living, more 
oriented toward interactions with other peo- 
ple, less concerned with his own solitary ex- 
periences, interested in a wider variety of 
events in daily living, and less prone to evade 
description of himself in general. 

Pt.’s Tempo. This is highly related to Pt.’s Units (r = -.84 in this
sample) since the less frequently a patient speaks in the standardized
interview, the longer of necessity will be his Tempo or “cycle” of
speaking. Hence the relationships shown for Pt.’s Tempo should be
considered in close conjunction with those for Pt.’s Units. As
indicated in Table 3, the longer the average duration of a patient’s
Tempo, the more he describes himself as dominant-hostile in his
dealings with and attitudes toward other persons.

4 The fact that these highly related measures of verbal activity level
do, however, at times differ somewhat in their relationships with
external behavior points out the necessity for retaining all of the
Interaction Chronograph variables despite some high intercorrelations,
until the areas of their overlap and independence have been more
completely defined and explored.

Pt.’s Silence. This is negatively related to 
the percentage of content units which are de- 
scriptive in nature and is positively related to 
the relative frequency of direct interaction 
with the interviewer. One might hypothesize 
that both silence behavior and remarks made 
to the interviewer represent “resistance” or 
avoidance tactics, since the questions asked of 
the patients are nondirective ones calling for 
description of their general life patterns and 
do not deal with events within the interview. 
This hypothesis is also supported by the tend- 
ency noted above for the number of (short) 
utterances to be inversely related to such di- 
rect discussion with the interviewer, as well as 
by the trend for longer silences to be accom- 
panied by the introduction of fewer different 
topics. 

Pt.’s Adjustment. Since all patients hesi- 
tated before responding for longer durations 
than they interrupted, in analyzing the data, 
the minus sign was omitted, so that high 
scores for Pt.’s Adjustment indicate greater 
relative latency of response. The results in 
Table 3 show that patients with high Pt.’s
Adjustment scores can be considered to be 
almost opposite to those with fewer (but 
longer) Pt.’s Units. Thus, “maladjustment” is 
negatively related to the relative frequency of 
description of the self as active and to the 
number of different other persons mentioned 
while it is positively related to the relative 
amount of talk about the self. Further, the 
“slow responders” relatively less frequently 
describe the acts or the attitudes, both inter- 
personal and noninterpersonal, of other peo- 
ple; when they do describe their own in- 
terpersonal attitudes or dealings with other 
people, they relatively more frequently char- 
acterize themselves as taking a submissive 
role, probably one which is also hostile. They 
also mention fewer negative noninterpersonal 
things about themselves while tending to focus 
more upon the noninterpersonal aspects of 
their lives in general. Similar to Pt.’s Silence, 
this latency-of-response measure tends to be 
negatively associated with amount of descrip- 
tive content and positively related to relative 
frequency of direct interaction with the interviewer,
supporting the hypothesis that this
distinction between description and interac- 
tion with the interviewer represents avoidance 
in the face of difficulty in communicating. 

The similar nature of the content category 
correlates of Pt.’s Units and Pt.’s Adjustment 
suggests what might be termed an “outward” 
or “other-directed” orientation in those patients
who speak fewer times and
also in those who speak with a shorter latency 
of response. Patients who speak more times 
and those who wait longer before speaking 
have content which focuses more upon them- 
selves, and thus might be termed “inward” 
or “self-directed.” This interpretation is sup- 
ported by our recent finding that schizo- 
phrenic patients, who might be thought of as 
being at the extreme pole on a continuum of 
inward-directed vs. outward-directed orienta- 
tion, speak a significantly greater number of 
times (but in shorter average utterances), and 
with much longer average latencies before re- 
sponding, compared to normals (Matarazzo & 
Saslow, 1961). 

Pt.’s Initiative. The more the patient shows 
temporal interactional initiative (speaks again 
following his own last utterance), the more he 
also shows a form of “initiative” in raising 
new topics (and the broader one might there- 
fore infer his interests or concerns to be). A 
similar but nonsignificant trend is shown for 
the total number of different persons men- 
tioned. 

Pt.’s Quickness. These results are supported 
by the finding that another temporal variable 
has similar correlations with the number of 
topics and number of persons mentioned. Un- 
like Pt.’s Initiative, however, Pt.’s Quickness 
is similar to Pt.’s Silence in being related sig- 
nificantly to the relative amount of descrip- 
tive content vs. interaction with the inter- 
viewer. Thus Quickness seems to have two 
components: one, the readiness to com- 
municate in patients who take the initiative 
rapidly, a covariation which is similar to that 
found for Silence; and secondly, a component 
which, like the relationship found for Initia- 
tive, is related to the broader range of con- 
cerns (more people and more topics) of these 
same patients (with the shorter Quickness 
durations). 




Pt.’s Dominance. This is directly related to 
the relative frequency with which the patient 
talks about people other than himself as well 
as to the relative concern he shows for the 
noninterpersonal feelings and behavior of 
others. Particularly striking is the finding 
that the more dominant the patient is in his 
temporal interview behavior, the more he describes
himself in content as dominant-positive
(e.g., teaching, helping, advising, protecting,
etc.) in his attitudes and dealings with
other people. Thus it seems that for the pa- 
tient whose interpersonal style is described (in 
content) as more dominant, the interruption 
behavior on the part of the interviewer in 
Period 4 is animating and challenging, rather 
than defeating. The findings for Pt.’s Domi- 
nance are very similar to those for Pt.’s Units, 
the content correlates of which also empha- 
sized relatively more discussion of others. 
However, patients with higher verbal output, 
as defined by Pt.’s Tempo, more frequently described
themselves as dominant-hostile rather
than dominant-positive in their own interpersonal
roles.⁵


5 There is no significant relationship between Pt.’s Units and
Dominance (r = ...), while Dominance and Pt.’s ... are negatively
correlated (r = -.45). Pt.’s ... is positively correlated with Pt.’s
Units, ...


DISCUSSION AND SUMMARY

The results seem both encouraging in an exploratory study relating two
quite disparate phenomena of interview behavior, and suggestive in the
new meaningfulness which they add to the Interaction Chronograph
variables by indicating relationships which seem inherently plausible
and internally consistent. They provide a foundation for an approach to
personality which combines content and temporal variables, and suggest
personality dimensions which, underlying as they do at least two quite
different spheres of behavior, may be particularly pervasive and
consistent.

Viewed as a whole, the data presented in Table 3 suggest that patients
who speak ... often, who are faster to respond, and who are dominant in
the interview (that is, ...) have content
which is relatively more oriented towards
other people and towards interpersonal inter- 
action, with social roles which are relatively 
more frequently described by them as domi- 
nant, either in a paternalistic or a hostile 
fashion. On the other hand, the correlations 
imply that the more a patient loses or sub- 
mits to interruptions, is hesitant in speaking, 
and is less active verbally, the more his con- 
tent emphasizes his own noninterpersonal con- 
cerns rather than interaction with others, and 
the more submissively hostile is his self-de- 
scribed role with other people. 

In conjunction with the results of other va- 
lidity-oriented studies of interview behavior 
correlates (Hare, Waxler, Saslow, & Mata- 
razzo, 1960; Matarazzo, Matarazzo, Saslow, 
& Phillips, 1958; Matarazzo & Saslow, 1961), 
the present findings are a beginning at de- 
scribing the characteristics which suggest a 
significant and cohesive description of the pa- 
tient and how he interacts with others, viewed 
both from the subjective (content-inferred) 
and objective (temporally measured behav- 
ior) levels of observation. The major impli- 
cation of these results bears upon the degree 
of generalizability of the Interaction Chronograph
constructs and hence has to do with
their concurrent (and, more remotely, con- 
struct) validity. 

The correlations shown in Table 3 are small 
in the sense of accounting for relatively little 
of the variance, although respectable for com- 
plex personality variables. Further, the com- 
plete correlation matrix contained 336 r’s (28 
content categories vs. 12 Interaction Chrono- 
graph variables), of which 28 were significant 
at the .05 level or better. Since 17 would have 
been expected to be significant at this level 
by chance alone, a replication study is being 
undertaken to determine whether the present 
findings can be cross-validated with another 
sample of subjects. A number of hypotheses 
are suggested by the results of this validity- 
oriented study which can be pursued in fu- 
ture investigations if the present findings are 
borne out in the replication study. 
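
The chance expectation cited above follows from simple arithmetic: 336 correlations tested at the .05 level give 336 x .05 = 16.8, or about 17, expected by chance. A minimal sketch is given below. The binomial tail is an added illustration that treats the 336 tests as independent, an assumption the paper does not make and which is unlikely to hold exactly, since the content categories and the Chronograph variables are themselves intercorrelated.

```python
import math


def expected_by_chance(n_tests, alpha):
    """Number of tests expected to reach significance when every null is true."""
    return n_tests * alpha


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p); assumes the n tests are independent."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


n_tests = 28 * 12                              # content categories x Chronograph variables
expected = expected_by_chance(n_tests, 0.05)   # about 16.8, reported as 17
tail = binomial_tail(n_tests, 28, 0.05)        # chance of 28 or more under independence
```

Because the independence assumption is doubtful here, the tail probability is at best a rough guide, which is consistent with the authors' decision to rely on a replication sample.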
THE CONCEPT OF NORMALITY:
A REPLY TO FREIDES

In an interesting paper published in this Journal, Freides (1960)
presses for the elimination of the concept of normality on the grounds
that there is little agreement concerning its definition, that it
ascribes absolute, yet culture bound, patterns of behavior and that it
disregards the flexible interactions between personality and
circumstances. Although Freides concerns himself nearly exclusively
with the idealist-adjustment view of normality, other approaches such
as the statistical-average conception are likewise dismissed with the
conclusion that “for purposes of scientific theory and also for
practical clinical purposes” our emphasis must shift away from
considerations of normality (or pathogenicity) and toward a greater
concern with the potentialities of every person under the proper
conditions.” Such a position, it will be argued here, springs from too
narrow a conceptualization of both scientific theory and clinical
purposes. It has, moreover, some unfortunate implications for future
research developments and basic orientations in clinical psychology.

Although normality is (today) an admittedly low powered concept,
judgments of normality-pathogenicity hover in the background of many
diagnostic and therapeutic decisions. It seems that, however ambiguous
and ill-defined a conception of normality our professional behavior
reflects, this term nonetheless behaves the way a good construct should
(..., 1953): as a summarizer of a host of personality characteristics
and behavioral tendencies, as a gross predictor of a person’s future
behavior and as an object of search. This last role will be made
clearer as this discussion develops.

If our clinical practice reflects (as I believe
it does) the use of some concept of normality,
it becomes appropriate to inquire into what 
kind of meaning can be ascribed to it. The 
controversy over the idealist-adjustive vs. the 
statistical-average interpretations of normality 
has obscured some more basic considerations. 
Why should the question, “What is normal- 
ity?” have a different logical status than, say, 
the question, “What is schizophrenia?” That 
a diagnosis of schizophrenia, for instance, im- 
plies a hypothesization of an inner structure 
or state which is but inadequately described 
by an operational definition, and that such 
taxonomic conceptualization is a scientifically 
valuable activity has been convincingly argued 
by Meehl (1959). Although our present day 
conception of schizophrenia is riddled through 
with indeterminacies, we would be hard put 
to do without it—clinically or scientifically. 
The concept of normality suffers from the 
same type of inadequacies, only more so, since 
it belongs to a far less imposing nomological 
network than does schizophrenia. 

In the case of both schizophrenia and nor- 
mality we are asking legitimate questions to 
which, unfortunately, no sufficiently complete 
or specific answer can be given at this time. 
But scientific theory as it is understood today 
(Hempel, 1952) accepts vagueness or open 
endedness in the real definitions (i.e., defi- 
nitions involving a statement of the essen- 
tial nature or characteristics) of concepts. We 
should understand the question, “What is normality?”
not as a demand for an unequivocal
definition (as Freides implicitly does) but
as a request for an empirically based specification
of the indicators (prevalence of control
factors? level of corticoseptal integration?
degree of reasonableness? appropriateness of
autonomic arousal? etc.) which may be found
to reflect with varying probability the existence
of this hypothetical state. During the
early stages of a science, the specification of 
the meaning of concepts is, as Kaplan (1946) 
notes, “a provisional one, both as to the indi- 
cators included and the weights associated 
with them.” 

The definition of normality, the hunt for 
the indicators of its essential characteristics, is 
thus construed as a processive affair marked 
by high responsivity to new data. An illustra- 
tion incorporating some suggestive data might 
be of interest. To begin with, let us consider 
a first temporary indicator of normality such 
as nonexposure to psychiatric diagnosis and 
treatment. As an operational definition of 
normality, this criterion would obviously be 
grossly inadequate as would many others. Our 
gambit rests on the hope that it may eventu- 
ally enable us to lift ourselves by our boot- 
straps (Cronbach & Meehl, 1955) by calling 
attention to other “purer” criteria which cor- 
relate (imperfectly) with it and which may 
in turn possess greater validity than our origi- 
nal indicator does. Many of our more success- 
ful constructs in psychology (e.g., Binet IQ) 
have evolved in this way. Using such a first 
indicator we would orient ourselves to dis- 
covering what other indicators (and charac- 
teristics by implication) compose a matrix of 
significant relatedness. It is clear at this point 
that this approach, in contrast to the idealist 
or average views of normality, would lead us 
to an empirically based conceptualization of 
normality having the essential character of a 
theoretical construct. We might pause to ask
how much reliable data do we happen to have 
concerning these “nonexposed” (or otherwise 
discriminated) normals? Surprisingly little. 
From reading the psychological literature a 
Martian might be led to believe that the pro- 
portion of schizophrenics in the United States 
is 80% instead of .8% or that the nonexposed
represent only some 10% of the population 
when they should number closer to 90%. 

A perusal of the literature dealing with such 
loosely defined normals leaves one major im- 
pression: there is an unexpectedly high dis- 
tribution of “pathogenic” traits, histories, be- 
haviors, in this population while by contrast 
many of the allegedly normal (ideal-adjusted) 
characteristics we have been taught to expect 
are by no means typical of it (Lapousse &
Monk, 1958; Renaud & Estess, 1955; Scho- 
field & Balian, 1959). For our purposes this 
would suggest that quite a few of the admi- 
rable, clearly nonpathogenic traits and behav- 
iors with which we have traditionally invested 
normality relate poorly to this concept as de- 
fined (preliminarily) by the nonexposure and 
other similar indicators. Parenthetically, this 
represents an attack on the same idealist- 
adjusted definition of normality that Freides 
finds unsatisfactory—without, however, elimi- 
nating normality as an object of scientific 
search. On the contrary, research on normal- 
ity should be quickened and stimulated by 
such findings, presuming they stand up under 
more systematic investigation. Are there, for 
instance, suppressor-control variables which 
override the effect of such pathogenic events 
(Schofield & Balian, 1959), and are they the 
indicators of normality? Or will the answer 
involve a favorable interaction between cer- 
tain constitutional factors and patterns of 
personal history? We certainly do not know 
the answer at this time. There can be little 
doubt, however, that the concept of normality
constitutes an object of search that is not
only scientifically appropriate but also of
great moment to psychology. Such research
emphasis may well be a factor in helping to 
precipitate a much needed shift away from 
the basic orientation to pathology that char- 
acterizes many of our efforts today. Clinical 
psychology needs to reacquaint itself with its 
most unique and natural subject matter— 
normal man. It is unfortunate, as Sanford 
(1958) has pointed out, that “we .. . talk 
much more freely about symptoms of illness 
than about symptoms of health or symptoms 
of resiliency or of strength or of spontaneity 
or of creativity” (pp. 82-83). 

Little has been said in this rejoinder concerning
the aspect of the problem of normality
which bears Freides’ strongest criticisms,
the evaluative component of normality. Our
approach simply sidesteps this argument altogether.
Normality as here conceived involves
absolute values or cultural biases no more and
no less than does, e.g., our concept of schizophrenia.
Certainly normality is culture bound,
but only to the extent that all concepts are
“bound” to the phenomena they describe
and explain. Likewise, what kind of value judgment
would be involved in the empirical determination
that, e.g., a specific suppressor-control
variable constitutes an essential characteristic
of normality? As far as we can see
the only values involved in this enterprise are 
the same that motivate all scientific research 
and as such they are rightly accepted as neu- 
tral with reference to the content of science. 

Although the bulk of this paper has been devoted to a defense of a
conceptualization of normality, we might briefly consider the
alternative that Freides champions, i.e., that we concern ourselves
instead with the specific abilities and limitations that individuals
demonstrate under specific circumstances. This point of view has a long
history in psychology as the specificity theory of personality, and as
such has been much discussed over the years. Some general comments,
however, seem appropriate here. First, in addition to seeking the
highest possible accuracy in the explanation and prediction of
behavior, science also attempts to be maximally comprehensive and
parsimonious. Freides’ approach represents an admission of failure with
respect to these desiderata. Secondly, the two approaches are by no
means incompatible. Classifying a person as “normal” does not by any
stretch of the imagination preclude an appreciation of his limitations
under certain circumstances. It can easily be conceived that broadly
specified ... variables may play a significant role in behavior
prediction equations. To what extent this is true we do not know. It is
further possible, once this typological approach has begun to yield
some fruits, that we will be able to shift to systematic studies of the
processes which favor normal psychological growth. It is clear,
however, that psychology would be poorly served if we gave up our
search for the meaning of normality, a potentially powerful explanatory
and predictive construct.
THE EFFECTS OF TWO VERBAL TECHNIQUES
ON THE EXPRESSION OF FEELINGS 1


GUSTAV LEVINE 2 
Teachers College, Columbia University 


A problem frequently encountered in psy- 
chotherapy is the client’s use of impersonal, 
nonemotional statements. The therapist’s re- 
quests for statements referring to the client’s 
feelings are frequently met with descriptions 
of situations or impersonal observations, even 
though the client is highly motivated to co- 
operate. Therapists therefore frequently at- 
tempt to behave in a manner that facilitates 
their client’s emotional expression. 

Rogers, in his early writings, stated his 
observation that reflection of feelings results 
in the immediate expression of further feel- 
ings by the client (Rogers, 1942, p. 158). 
Studies of recorded therapy sessions which 
utilized categories of clients’ expressions of 
feelings and therapists’ reflections of 
have not examined the specific seq 
reflection and expression of feelings (Berg- 
man, 1951; Seeman, 1949; Snyder, 1945). 
Verplanck (1955), in an experiment involy- 
ing a social rathet than a therapeutic situa- 
tion, found that paraphrasing, a technique 
which is descriptively similar to reflection, 
increased the class of statements paraphrased, 
Although feelings were not paraphrased, but 
rather statements of opinion, the results are 
encouraging to a hypothesis that reflection of 
feelings increases expression of feelings. In 
the Verplanck study, paraphrasing of the class 
of responses to be increased was the only re- 
sponse given by the interviewers. This re- 
stricted attention to one class of response may 


feelings, 
uence of 


1 This paper is taken from portions of a thesis sub- 
mitted to the Department of Clinical Psychology, 
Teachers College, Columbia University, in partial 
fulfillment of the requirements for the PhD degree. 
The author wishes to express his appreciation to his 
Chairman, Joel Davitz, for his aid and encourage- 
ment. 

2 Now at the Creedmoor Institute for Psychobio- 
logic Studies. 


have been the factor which reinforced this 
response (rather than the specific paraphras- 
ing technique), implying that any technique 
involving restricted attention to one class of 
response would reinforce that class. On the 
other hand, the specific technique (paraphras- 
ing) may be reinforcing in the same way that 
approval can be reinforcing (Murray, 1956). 
The contention that restricted attention is the 
significant factor is supported by the work of 
Salzinger and Pisoni (1958, 1960), who found 
that simple statements such as, “I see,” 
“Yeah,” “Uhha,” “Mmm-hmm,” etc., could 
act as positive reinforcers of expression of 
feelings. 

The present study involves a comparison of 
the specific technique of reflection of feelings 
with a simple undifferentiated vocalization, 
“Mm-hm,” both applied as the only response 
given, in separate interviews, and occurring 
only after expressions of feeling. In such a 
pair of interviews both interviewer techniques 
would constitute restricted attention to a class 
of response, and any special advantage of the 
more complex response could express itself in 
more effective reinforcement. 

It was hypothesized that reflection of feel- 
ings results in greater expression of feelings 
than does an undifferentiated vocalization 
(“Mm-hm”). 

The subjects were 30 male undergraduate 
students, living at college, who volunteered 
and were paid for their participation. 

The experiment was explained as a study of 
feelings about school life. The experimenter 
stated that he was “interested in what you are 
experiencing” as a student. The subjects each 
received two similarly structured interviews 
which differed only in the way in which the 
experimenter responded to expression of feel- 
ings by the subjects (reflection in one, “Mm- 
hm” in the other). 

The interviews were scheduled one week 
apart and tape recorded. One half of the sub- 
jects received one condition first, the other 
half receiving the other condition first. 

The interviews were rerecorded with the 
interviewer’s responses omitted, and with the 
tape divided into one-minute segments of talk 
by the subjects. The presence or absence of 
expression of feelings in a single minute was 
determined by six psychologists. These judges 
used a detailed set of instructions given to 
them by the experimenter as a frame of refer- 
ence for their judgments.3 

The basic data for the testing of the hy- 
pothesis were the number of minutes contain- 
ing expressions of feeling. 

The one-minute segments were rated for 
presence or absence of feeling by both the 
experimenter and a judge. The agreement was 
high, a phi coefficient of .92 having been ob- 
tained through a transformation of the com- 
puted chi square value of 428.9. 
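
The transformation mentioned here is presumably the standard one for a 2 x 2 table, phi = sqrt(chi square / N). The sketch below illustrates that relation only; the number of rated segments is not reported in the text, so the N used is a back-calculation from the two published figures and should be read as an assumption, not a reported value.

```python
import math


def phi_from_chi_square(chi_square: float, n: int) -> float:
    """Phi coefficient for a 2 x 2 table: sqrt(chi-square / N)."""
    return math.sqrt(chi_square / n)


def implied_n(chi_square: float, phi: float) -> float:
    """Invert the same relation to see what N the two figures imply."""
    return chi_square / phi ** 2


chi_square = 428.9   # reported in the text
phi = 0.92           # reported in the text

# N is NOT given in the paper; this is only the value implied by the
# two reported statistics, subject to rounding in phi.
n_estimate = implied_n(chi_square, phi)
print(f"implied number of rated segments: about {n_estimate:.0f}")
print(f"phi recomputed with that N: {phi_from_chi_square(chi_square, round(n_estimate)):.2f}")
```

Because phi was reported to two decimals, the implied N is approximate; any N between roughly 500 and 512 reproduces .92.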

The average number of minutes of feeling 
in each of the two interview conditions was 
computed. In the reflection interviews there 
was an average of 10.07 minutes containing 
expression of feelings, and in the “Mm-hm” 
interviews there was an average of 9.83 min- 
utes containing expression of feelings. A t 
test of the difference yielded a nonsignificant 
t of .23 (p value illegible). 

These results indicate that under the condi- 
tions of the present experiment there is no 
difference in effectiveness between reflection 
of feelings and the undifferentiated vocaliza- 
tion “Mm-hm,” as techniques for increasing 
expression of feelings. The additional factor 
of clarification of feelings through re-expres- 
sion in different words does not seem to in- 
crease the frequency of expression of feelings 
beyond that obtained with any technique 
which responds only to feelings. 

The experiments of Verplanck (1955) and 
Greenspoon (1955) indicate that the two 
techniques can each be effective compared to 
no technique (operant level), but there was 
no control in the present experiment with 
which to make such a comparison. 

3 The instructions are in the appendix to the 
thesis, which can be obtained on microfilm from 
University Microfilms, 313 North First Street, Ann 
Arbor, Michigan. Order L.C. Card No. [illegible], 
remitting $2.00. 
BRIEF REPORTS 


ANXIETY IN VERBAL BEHAVIOR: 
AN INTERCORRELATIONAL STUDY 1 


MERTON S. KRAUSE 


University of Michigan 


Several objective measures of anxiety in verbal 
behavior have been proposed in the last 3 dec- 
ades. Their claims to validity are relatively weak, 
generally resting upon their apparent reasonable- 
ness. If they appear to reflect the same “inner 
state” and so yield highly correlated measure- 
ments on the same behavior samples, their claims 
would be stronger. They might be said to show 
concurrent validity. 

Ten-minute recorded samples from the therapy 
sessions of 15 hospitalized male mental patients 
were studied. Eight purported measures of anx- 
iety were applied to each patient’s verbal response 
in the 15 protocols. The measures were (a) num- 
ber of words spoken, (b) number of words/num- 
ber of inspirations, (c) number of verbs/number 
of adjectives, (d) latency of the response, (e) 
number of references to the interviewer, (f) and 
(g) number of speech disruptions as measured 
by Mahl’s (1956) “non-ah” ratio and Dibner’s 
(1956) Cue Count I, and (h) rate of speech. 

An intercorrelation matrix was computed for 
each of the 15 protocols to study the amount of 
individual differences, and then the matrix of 
median intercorrelations was derived. The aver- 
age level of correlation in this latter matrix was 


1 This study was supported under Grant M-516 
C-7 from the National Institute of Mental Health, 
E. S. Bordin, Principal Investigator. 

An extended report of this study may be obtained 
without charge from Merton S. Krause (2343 Auburn 
Avenue; Cincinnati 19, Ohio) or for a fee from the 
American Documentation Institute. Order Document 
No. 6642, from ADI Auxiliary Publications Project, 
Photoduplication Service, Library of Congress; Wash- 
ington 25, D. C., remitting in advance $1.25 for 
microfilm or $1.25 for photocopies. Make checks 
payable to: Chief, Photoduplication Service, Library 
of Congress. 


.06, and so no general convergence of the several 
measures is evident. If these measures do possess 
some degree of concurrent validity it was not 
uniform enough over persons to appear in our 
sample. The cluster pattern in the matrix of me- 
dian intercorrelations does suggest some conver- 
gences in our set of measures. As might be ex- 
pected by inspection of the measures themselves, 
the two speech disruption measures were highly 
correlated (r_fg = .91), while verbs/adjectives and 
number of words were moderately loaded on the 
same factor (about .42 and .24, respectively). 
This pattern of Measures f, g, c, and a held up 
for about half of the protocols. In at least two 
protocols, however, this pattern clearly disinte- 
grated. 

The average results have a very large sampling 
error over persons. The average interquartile 
range for the median values was .45. Thus, there 
are marked individual differences in the level of 
intercorrelation among the various verbal anxiety 
measures. This implies that different measures 
may be valid for different persons and that what 
measurement values indicate anxiety or nonanx- 
iety may also be idiosyncratic. These results sug- 
gest that verbal measures are not going to be any 
less troublesome to validate than are physiologi- 
cal measures of anxiety. 
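
The analysis described in this report (one intercorrelation matrix per protocol, a matrix of medians across protocols, and the interquartile range as an index of individual differences) can be sketched as follows. This is an illustration only: the data are random numbers, and the number of scoring units per protocol (`n_units`) is an assumption, since the report does not say how each 10-minute sample was segmented.

```python
import numpy as np

rng = np.random.default_rng(0)

n_protocols = 15   # patients, as in the report
n_units = 40       # ASSUMED number of scored units per protocol
n_measures = 8     # measures (a) through (h)

# Hypothetical scores: protocols x units x measures.
scores = rng.normal(size=(n_protocols, n_units, n_measures))

# One 8 x 8 intercorrelation matrix per protocol.
per_protocol = np.stack([np.corrcoef(p, rowvar=False) for p in scores])

# Element-wise median across the 15 protocols.
median_matrix = np.median(per_protocol, axis=0)

# Element-wise interquartile range across protocols.
q75, q25 = np.percentile(per_protocol, [75, 25], axis=0)
iqr_matrix = q75 - q25

# Summaries over the distinct pairs of measures (upper triangle only).
upper = np.triu_indices(n_measures, k=1)
mean_level = median_matrix[upper].mean()
mean_iqr = iqr_matrix[upper].mean()

print(f"average median intercorrelation: {mean_level:.2f}")
print(f"average interquartile range:     {mean_iqr:.2f}")
```

With real data, a low average median correlation combined with a large average interquartile range would correspond to the pattern reported here: little general convergence and marked differences between persons.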
VERBAL AND PERCEPTUAL COMPONENTS IN WISC PERFORMANCE 
AND THEIR RELATION TO SOCIAL CLASS * 


JOHN B. MARKS 
Veterans Administration Hospital, American Lake, Washington 
AND JAMES E. KLAHN 
Tacoma Public Schools 


Most investigators have found measured intelli- 
gence positively related to social class and have 
found this relation closer with verbal rather than 
nonverbal materials. With the WISC, however, 
Estes (1953) found higher status superiority only 
at the 7-year-old level and not at the 10-year-old 
level. Moreover, she found no consistent pattern 
of differences of subtests between her upper social 
group and her lower group. 

The present study relates social class to two 
WISC measures of verbal-nonverbal difference: 
(a) Verbal IQ-Performance IQ, and (b) the dif- 
ference between weighted scores of subtests highly 
loaded on Cohen’s (1959) verbal factors and 
weighted scores of subtests highly loaded on the 
perceptual factor. This latter was the mean of In- 
formation, Comprehension, Similarities, and Vo- 
cabulary minus the mean of Picture Completion, 
Block Design, and Object Assembly. 
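
The factor-derived difference measure described above is a simple difference of two subtest means. A minimal sketch follows; the function name and the example scores are hypothetical, and only the subtest groupings come from the text.

```python
from statistics import mean

# Subtest groupings as described in the text.
VERBAL = ("Information", "Comprehension", "Similarities", "Vocabulary")
PERCEPTUAL = ("Picture Completion", "Block Design", "Object Assembly")


def verbal_perceptual_difference(weighted_scores: dict) -> float:
    """Mean of the verbal subtests minus mean of the perceptual subtests."""
    verbal_mean = mean(weighted_scores[name] for name in VERBAL)
    perceptual_mean = mean(weighted_scores[name] for name in PERCEPTUAL)
    return verbal_mean - perceptual_mean


# Hypothetical weighted scores for one child (not data from the study).
child = {
    "Information": 12, "Comprehension": 11,
    "Similarities": 13, "Vocabulary": 12,
    "Picture Completion": 10, "Block Design": 9, "Object Assembly": 11,
}
difference = verbal_perceptual_difference(child)
print(difference)  # positive values indicate relative verbal superiority
```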
Subjects were 211 primary school children di- 
vided by age and sex into four groups. Both 
younger groups were within 6 months of their 
[illegible] birthday at testing while the older ones 
were within 6 months of their eleventh birthday. 
All had been tested because of some school diffi- 
culty, but children with IQs below 70 were elimi- 
nated in sample selection. Mean IQs of the sample 
were close to population means. 

From information about the father’s occupation 
each subject was assigned to an occupational 
class group ranging from 1, casual laborer, to 
[illegible], business leader. The r between independent rat- 
ings of the two authors was .94 and means of the 
two ratings were used. 

Occupational ratings correlated positively with 
IQs in both the younger and the older groups. 
Though the correlation in the younger group is 
higher it is not significantly so. On the other 
hand, the girls show a substantially higher cor- 
relation than do the boys. For verbal and full- 
scale IQs this difference is significant, .42 com- 
pared to .19 for verbal and .45 to .17 in the full- 
scale. 

The two measures of verbal-nonverbal differ- 
ence were tested for their relation to occupational 
level, both directly through correlations, and by 
contrasting the difference measures for unskilled 
and semiskilled labor children with those for 
white collar children. The difference between ver- 
bal and performance IQs was in the expected di- 
rection but not significant. The difference in the 
factor-derived measures had a significant r of .16 
(p < .05, N = 211) with occupational level and 
the white collar children showed a greater verbal 
superiority (t = 2.41, p < .01, 113 df). 
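
As a check on the reported significance of r = .16 with N = 211, the usual conversion of a correlation to a t statistic with N - 2 degrees of freedom can be worked through. The authors do not state which test they used, so this is a sketch under the assumption of that standard conversion.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for a Pearson r, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r, n = 0.16, 211          # values reported in the text
t = t_from_r(r, n)
print(f"t = {t:.2f} with {n - 2} df")
```

The result, about 2.34, exceeds the two-tailed .05 critical value of roughly 1.97 for 209 df, which agrees with the reported p < .05.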

These results are consonant now with results 
using other instruments. On the WISC both 
younger and older groups show a correlation of 
IQ with occupational class, and this correlation 
is higher when verbal materials are used than it is 
for nonverbal. The closer relation between occu- 
pational level and IQ among girls than among 
boys may stem from the higher peer value which 
girls put upon middle class verbality. 
SENSORY DEPRIVATION AND ITS RELATION 
TO PROJECTION 1


MALCOLM H. ROBERTSON AND ROBERT C. MARTIN 


University of Florida 


The study was designed to test the hypothesis 
that sensory deprivation lowers the threshold for 
projection. To test this hypothesis, a control 
group and an experimental group of 10 subjects 
each were used, half male, half female. The con- 
trol subjects received no deprivation and were 
tested individually for projection using a modi- 
fied autokinetic technique. The technique con- 
sisted of presenting the subject with a dim source 
of light approximately 1 mm. in diameter at a 
distance of 9 feet. The subject was told to watch 
the moving pinpoint of light and, when it went 
off, to report what it suggested, or looked like, or 
made him think of. There were 12 1-minute trials 
with a 2-minute rest period after the sixth trial. 

The experimental subjects were exposed to sen- 


1 An extended report of this study may be obtained 
without charge from Malcolm H. Robertson (De- 
partment of Psychology, University of Florida; 
Gainesville, Florida) or for a fee from the Ameri- 
can Documentation Institute. Order Document No. 
[...] Photoduplication Service, Library of Congress; 
Washington [...] microfilm or $1.25 for photocopies. 
Make checks payable to: Chief, Photoduplication 
Service, Library of Congress. 


sory deprivation for 3 hours and then tested im- 
mediately with the autokinetic technique in the 
same manner as the control subjects. In the depri- 
vation condition, each subject wearing opaque 
goggles, cotton mittens, and cardboard cuffs, was 
placed on a bed with his head inside a foam rub- 
ber lined box. 

The two groups were compared in terms of the 
number of responses as well as the number of 
stimulus-bound responses, original responses, and 
popular responses. Stimulus-bound responses were 
those that referred solely to the movement or di- 
rection of movement of the light. An original re- 
sponse was one that was given only once, by only 
one person, and in addition was considered by 
the two investigators to be very novel or un- 
usual. A popular response was one that was found 
in several records, usually occurring more than 
once in a record. 

It was expected that the deprivation group 
would show more projection (greater productiv- 
ity, larger number of original responses, and 
fewer popular and stimulus-bound responses) 
than the control group. Differences between the 
two groups were not statistically significant. 
PSYCHIATRIC OUTPATIENT PERSONALITY PATTERNS 1


MARY HELEN TATOM 
Spring Grove State Hospital, Baltimore 


In a previous study (Tatom, 1958) five patients 
were selected on the basis of medical diagnoses 
as representatives of each of four commonly oc- 
curring nosological entities: obsessive-compulsive, 
hysteric, anxiety state, and outpatient schizo- 
phrenic. They were rated by their respective in- 
dividual therapists on each of 67 personality vari- 
ables, and the resulting scores intercorrelated. 
The 20 x 20 matrix of person-to-person correla- 
tions was factored by Thurstone’s centroid method 
to yield four primary and two second-order fac- 
tors, tentatively identified with clinical syndromes. 
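
The first steps of the analysis described above (correlating persons rather than variables, then extracting a centroid factor) can be sketched as follows. The ratings are random placeholders on an assumed 7-point scale, and unities are left in the diagonal for simplicity, whereas centroid analyses of the period normally inserted communality estimates; only the first factor is shown, without sign reflection or rotation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical ratings: 20 patients x 67 personality variables,
# on an ASSUMED 7-point scale (the scale is not given in the report).
ratings = rng.integers(1, 8, size=(20, 67)).astype(float)

# Person-to-person correlations: each row (patient) is treated as a variable.
person_r = np.corrcoef(ratings)          # shape (20, 20)

# First centroid factor: column sums divided by the square root of the
# grand total of the matrix.
column_sums = person_r.sum(axis=0)
first_centroid = column_sums / np.sqrt(person_r.sum())

# Residual matrix after removing the first factor; its entries sum to
# zero, which is why later factors require sign reflection.
residual = person_r - np.outer(first_centroid, first_centroid)

print(np.round(first_centroid[:5], 2))
print(f"sum of residuals: {residual.sum():.2e}")
```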

To test the stability of these patterns, 2 refer- 
ence individuals for each of the four primary fac- 
tors were inserted into a matrix with 16 new pa- 
tients, selected to represent equally the original 
four clinical diagnostic entities. A factor analysis 
paralleling the first was carried out. The position 
of reference individuals with respect to primary 
factors in the two analyses is given in Table 1. 
Trait patterns of structurally corresponding fac- 
tors in the two analyses suggested that syndromes 
were replicated in the second analysis, though 
overlap of specific items was not extensive (e.g., 
two internally consistent clusters of clinically 
schizoid traits had in common only seclusiveness 
and poor social adjustment). Primary factors 
were tentatively identified with the respective 
factors of the original analysis as follows: out- 
patient paranoid schizophrenic, conversion hys- 
teric, socially mature personality, and passive- 
dependent personality vs. antisocial personality. 
The latter was less clearly defined by reference 
individuals than were the other three. Patients in 
both analyses were grouped largely on the basis 
of second order factors, rather than highly cor- 
related primary factors. Factor and trait patterns 
tended to confirm two second-order factors: 
schizothymia vs. cyclothymia, and uncontrolled 
emotionality vs. overcontrolled emotionality. 

Psychiatric and factorial classification of pa- 
tients did not agree in either analysis; syndromes 
isolated factorially corresponded only roughly to 
the nosological entities represented by patients in 
both groups of patients analyzed. 
IDENTIFICATION IN TERMS 
OF PERSONAL CONSTRUCTS: 
RECONCILING A PARADOX IN THEORY 1


ROBERT E. JONES 


Veterans Administration 


In this study, identification is defined as per- 
ceived similarity of self and others, experienced 
in terms of personal constructs. The author ac- 
cepts the definition of phenomenological psy- 
chologists and the thinking of psychoanalysts, 
such as R. P. Knight, who emphasize identifica- 
tion as a relationship rather than a process, and 
as a perceived rather than as an actual similarity. 

The psychology of personal constructs, the 
theoretical system developed by G. A. Kelly, is 
a perceptual approach to the prediction and ex- 
planation of human behavior. One’s personal con- 
structs are the vehicles, verbally expressed, by 
which one anticipates the behavior of others and 
guides his own behavior. Constructs are dimen- 
sions defined by terms which the perceiver ac- 
cepts as opposites or contrasts. The way in which 
two persons are seen as alike yet different from a 
third would be a construct. A form of the Role 
Construct Repertory Test is employed to meas- 
ure identification with “significant others.” 

Identification with male figures in the Reper- 
tory test was taken as the central measure in the 
present research. Repeat-test reliability was es- 
tablished in the high .80s. The subjects were 36 
hospitalized males with mild or moderate psychi- 
atric disorders, and a control group of 36 normal 
males, matched on age and education. 

The central hypotheses were two: (a) neuro- 
psychiatric (NP) patients more often than nor- 
mal adult males will either overidentify or under- 


1 An extended report of this study may be ob- 
tained without charge from Robert E. Jones (Psy- 
chology Service, Veterans Administration Hospital; 
Danville, Illinois) or for a fee from the American 
Documentation Institute. Order Document No. 6646 
from ADI Auxiliary Publications Project, Photo- 
duplication Service, Library of Congress; Washing- 
ton 25, D. C., remitting in advance $1.75 for micro- 
film or $2.50 for photocopies. Make checks payable 
to: Chief, Photoduplication Service, Library of Con- 
gress. 

identify with personally significant male figures; 
and (b) the personal construct matrices of NP 
patients will be simpler than those of controls. 

The hypothesis of underidentification is sup- 
ported only at the 10% level, by one-tailed t test. 
The hypothesis of overidentification is significant 
at the 1% level: NPs are more likely to see 
others as extremely like the self than are nor- 
mals. Also as predicted, the idiographic factors 
required to “explain” the interpersonal matrix 
are significantly fewer (at the .05 level) for NPs 
than for the controls. For NPs, but not for nor- 
mals, the more simple the factor matrix the more 
fully it is explained by a value construct (.01 chi 
square significance). 

The major contribution of the Personal Con- 
struct approach to identification theory lies in the 
reconciliation of previous theories. Both E. H. 
Erikson and O. H. Mowrer are partly right. Both 
over- and underidentification are common badges 
of maladjustment. Both are associated with a 
common defect: a factorially simple, value-laden 
system of constructs. Dimensions of perception 
permitting useful discrimination become inoper- 
ant. With the construct system compelling polari- 
zation into “good guys and bad,” we see, with 
Sanford, how “desperation” promotes classical 
identification of the all-or-none variety. Identifi- 
cation with a hated object tends to be uncon- 
scious and accomplished by introjection. Identifi- 
cation with an idealized object tends to be con- 
scious and achieved by assimilative projection. 
Sanford’s “identification proper” is the uncon- 
scious type, always desperation motivated. Either 
type of identification, whether excessive or defi- 
cient, can be explained in terms of an oversimpli- 
fied construct system, preempted by the value 
dimension. 

The findings support Rinder and Campbell’s 
contention that both over- and underidentifica- 
tion reflect the same neurotic dynamic: in their 
terms, “undue reaction-sensitivity”; in ours, un- 
due perceptual restriction to the value dimension. 
A STUDY OF READING DISABILITY 


RICHARD H. WALTERS,1 MALLE VAN LOAN, AND IRENE CROFTS 


University of Toronto 


The psychoanalytic theory of reading dis- 
ability (Blanchard, 1946; Fenichel, 1937; 
Strachey, 1930), which is closely related to 
Freud’s (1924) theory concerning hysterical 
blindness, has received relatively little atten- 
tion from research psychologists in spite of its 
popularity among psychoanalytically oriented 
therapists. In a recent outline of this theory, 
Jarvis (1958) has drawn attention to three 
factors that supposedly characterize the re- 
tarded reader: an avoidance of looking, the 
problem of aggression, and faulty identifica- 
tion mechanisms. Jarvis regards the “active 
part of looking” as creating the major diffi- 
culty for the retarded reader; the counterpart 
of this activity outside the school “is an in- 
ability to identify predominantly with one’s 
own sex” (p. 468). 

The studies reported in this paper were sug- 
gested by the psychoanalytic theory of read- 
ing disability. Since it is the alleged sexual 
significance of reading that, according to psy- 
choanalysts, gives rise to fear or avoidance 
of looking, it was hypothesized that retarded 
readers would show greater hesitation in look- 
ing at a sexual object than would either av- 
erage or advanced readers, and that, through 
generalization, this “avoidance” would also be 
evident in perceptual tasks involving identifi- 
cation of nonsexual objects. It was further hy- 
pothesized that retarded readers would dis- 
play hostility toward the same-sex parent, a 
condition that might reflect both aggression 
METHOD 
Subjects 


Ss were 58 Grade 3-6 boys from a single suburban 
school. They were selected from a larger group of 86 
boys who were of average intelligence (IQs between 
88 and 112) and who were, according to their medi- 
cal histories, free of eye anomalies, hearing losses, 
and behavior problems. Of the 86 boys, 20 had read- 
ing ages that were at least 1 year below their men- 
tal ages; these boys were regarded as retarded read- 
ers. Eighteen boys whose reading ages were at least 
1 year above their mental ages were regarded as ad- 
vanced readers. From the remaining 48 boys, the 20 
boys with the smallest discrepancies (less than 6 
months) between reading age and mental age were 
selected as average readers. 
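
The selection rule described above can be summarized in a short sketch. The function name, the use of months as the unit, and the "unclassified" label are illustrative assumptions; the group labels are the paper's own terms. Note that in the study the average group was further limited to the 20 boys with the smallest discrepancies, which this per-child rule does not capture.

```python
def classify_reader(reading_age_months: int, mental_age_months: int) -> str:
    """Assign a reading group from the reading-age/mental-age discrepancy."""
    discrepancy = reading_age_months - mental_age_months
    if discrepancy <= -12:          # at least 1 year below mental age
        return "retarded"
    if discrepancy >= 12:           # at least 1 year above mental age
        return "advanced"
    if abs(discrepancy) < 6:        # less than 6 months either way
        return "average candidate"
    return "unclassified"           # between 6 and 12 months: not selected


# Hypothetical ages in months, not data from the study.
print(classify_reader(96, 110))    # retarded
print(classify_reader(120, 105))   # advanced
print(classify_reader(100, 103))   # average candidate
```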

Reading ages and mental ages were available in the 
school records; since the reading tests and intelligence 
tests had, in some cases, been administered several 
months apart, the mental ages were corrected to cor- 
respond to the time at which the reading test was 
given. Unfortunately, the tests administered varied 
to some extent from grade to grade. However, since 
the administration of further tests would have too 
greatly disrupted the school timetable, the available 
indices were accepted as reasonably adequate bases 
for group selection. The tests given were, in fact, 
very similar, e.g., the Otis group test and the Do- 
minion Test of Learning Capacity as measures of 
intelligence and the Gates Vocabulary, Schonell Vo- 
cabulary, and Dominion Silent Reading Tests as 
measures of reading ability. 


Apparatus and Tests 

Two perceptual tests 4 developed for the Cerebral 
Palsy Project of the Department of Psychology, Uni- 
versity of Toronto, were included in the test battery. 
The Steer-Allen Figure-Ground Confusion Test con- 
sists of 15 cards, on each of which is depicted an ob- 
ject whose outline is interrupted by the background. 
In order to identify the object, S must free the figure 
from the confusing background detail. Since this test 
was developed for studies of children younger than 
those used in this study, it was employed primarily 
as a rapport-building test at the commencement of 


promising results for boys, but negative results for 
girls. Mimeographed copies of this study may be ob- 
tained from the senior author. 

4 The authors are indebted to H. O. Steer both for 
permission to use these tests and for his helpful ad- 
vice on this study. 
Fig. 1. Multiple-choice apparatus, showing nude doll 
used as test object. 


testing. The Steer-Beatty Closure-Threshold Test, a 
more suitable test for children in Grades 3-6, con- 
sists of 12 sets of five cards, each card being com- 
posed of dots. Each set forms a graduated series of 
representations of the same object. The degree of 
clarity of the representations is a function of the 
relative spacing of dots that compose the figure and 
dots that compose the background. The cards in each 
set are presented consecutively in an order that makes 
the identification of the object (figure) increasingly 
less difficult. These perceptual tests were included be- 
cause they appeared to require “active looking” in the 
sense in which this term is used by psychoanalysts. 

A multiple-choice apparatus (Figure 1) was used 
for the major test of the “fear of looking” hypothe- 
sis. A box, approximately 3 feet long and 1½ feet 
high, was separated into four compartments. Each 
compartment was fronted by a separate door which 
opened easily. The back of the box also opened to 
allow the experimenter (E) to insert an object into 
any one of the compartments. The doors and the back 
of the box were connected with a buzzer and an elec- 
tric timer. When the box was completely closed, back 
and front, the electrical circuit was completed and 
both the buzzer and timer were set in operation. 
Opening any one of the doors interrupted the circuit 
and shut off both the buzzer and the timing mecha- 
nism. This apparatus provided an automatic record- 
ing of the interval between E’s signal for S to re- 
spond (buzzer) and S’s opening of one of the doors 
to look at a hidden object. Three objects were used 
during testing: a female Dutch doll of a conventional 
type (neutral object), a nude male doll with a penis 
(the type sometimes employed by psychoanalytically 
oriented child psychiatrists), and a clothed boy doll (a 
second neutral object).5 
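
The timing logic of the apparatus, as described above, amounts to a timer that starts when the box is fully closed and stops when any door is opened. The following is a toy model only: the class and method names are hypothetical, and clock readings are passed in explicitly so the example stays deterministic.

```python
class MultipleChoiceBox:
    """Toy model of the four-door box with buzzer and electric timer."""

    def __init__(self, n_doors: int = 4):
        self.n_doors = n_doors
        self.started_at = None   # None means the circuit is open

    def close_all(self, now: float) -> None:
        """Closing back and front completes the circuit: buzzer and timer start."""
        self.started_at = now

    def open_door(self, door: int, now: float):
        """Opening any door breaks the circuit and yields (choice, latency)."""
        if self.started_at is None:
            raise RuntimeError("box is not closed; no trial in progress")
        if not 1 <= door <= self.n_doors:
            raise ValueError("no such door")
        latency = now - self.started_at
        self.started_at = None
        return door, latency


box = MultipleChoiceBox()
box.close_all(now=10.0)                        # buzzer sounds, timer runs
choice, latency = box.open_door(2, now=12.5)   # S opens door 2
print(choice, latency)                         # door chosen and seconds elapsed
```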

Since psychoanalysts have stressed the importance 
of “underlying” unconscious psychological determi- 
nants, and some researchers (e.g., Friedman, 1952) 


5 The choice of a nude male doll as the test ob- 
ject was based on psychoanalytic symbolism (Fenichel, 
1937; Freud, 1953; Strachey, 1930) concerning the 
act of looking and the mastery of reading. The play 
therapy dolls were lent by Alice Moultby of the 
York Township Child and Adolescent Guidance 
Clinic. 
have claimed that these can be diagnosed only 
through the use of projective techniques, a brief pic- 
ture-story test was added to the test battery. Four 
pictures were included: 6 

1. A boy is shown turning his back on an older 
male figure, who is walking away in the opposite di- 
rection. 

2. A boy is shown looking into a bathroom. An 
arm of a taller figure protrudes from the shower 
curtain, and water can be seen coming from the rose 
of the shower. 

3. A boy stands in front of an older seated male 
figure, who is reading a newspaper. The boy is 
gesticulating with arms outstretched. 

4. A boy is shown looking into a room in which 
an adult male and female are embracing. 

Pictures 1 and 3 were chosen to test hostility to- 
ward the father; Pictures 2 and 4 to test fear and 
avoidance of looking. 

An attempt was made to assess identification by a 
modified version of Fiedler’s (1958) Assumed Simi- 
larity to Others (ASo) Test. This test, however, 
proved to be beyond the comprehension of the 
younger children in the study and, consequently, the 
results were of little value. A simple test of parent 
preference, described below, was also given. 


Procedure 


S was brought to the experimental room by a fe- 
male E and was seated at a desk facing the discrimi- 
nation apparatus. E seated herself to the rear of the 
apparatus about 4 feet away from S. 

E first presented the Figure-Ground Confusion 
Test, using the following instructions: “I am going 
to show you some pictures in each of which a thing 
is hidden. I want you to tell me what that hidden 
thing is.” S was given a trial run with a card repre- 
senting a bird, after which E pointed out the details 
of the bird. The remaining pictures were then pre- 
sented one at a time. No time limit was set. If S said 
that he could not find the object, E simply presented 
the next card. S’s responses were recorded on data 
sheets. 

S was now asked to stand in front of the discrimi- 
nation apparatus and was given the following in- 
structions: 


Here I have this doll [E held up the Dutch doll] 
and I am going to hide it in one of these boxes. 
I want you to see if you can find it. You can do 
this by opening one of these doors [demonstration] 
as soon as you hear the buzzer. When you hear the 
buzzer you can open the door in which you think 
I have hidden the doll. Don’t close the door again 
until I tell you to. 

S was given three trial runs with the Dutch doll; 
then the experiment proper was begun. The doll was 
placed into the various compartments from the rear 
of the apparatus in a predetermined random se- 
quence. S was given 10 trials with this doll. 
[...] and choice of box in each trial were recorded. 

6 These pictures form part of a series developed by 
Albert Bandura of Stanford University. 

E now took out the nude father doll, holding it 
up in the air in full view of S. E said: “Look at 
this doll! Now I am going to do the same thing with 
this doll, and I want you to do exactly as you did 
before.” Again the doll was placed, in a predeter- 
mined random sequence, into each of the compart- 
ments. Ten trials were given with the father doll, and 
the results were recorded as before. 

Finally, E took the clothed boy doll. Holding it 
up, E said: “Now I am going to use this doll, and 
I want you to do exactly the same as before.” This 
time the doll was placed in each compartment in an 
orderly sequence: 1-2-3-4, 1-2-3-4. The purpose of 
this final set of trials was twofold. It was thought 
that the subsequent presentation of a neutral object 
might reduce any emotional upset produced by the 
nude doll; in addition, through the use of a regular 
sequence, it seemed likely that S would finish up by 
making some “correct” responses, so reducing pos- 
sible feelings of failure. The plan was to continue 
hiding the boy doll until S had made two successive 
correct responses. Within the time limit imposed by 
the school schedule, this was not possible in all cases. 

The design of the experiment would have been im- 
proved if presentations of a neutral doll and the nude 
doll had been made in random order. However, to 
avoid the possibility of emotional upset and of 
subsequent parental complaints about the use of a 
nude figure, it was thought wiser to buttress the 
presentations of the nude figure with preceding and 
subsequent presentations of neutral figures. 
S was now taken to a second female E, who had not 
been associated with the presentation of the nude 
figure. E seated S beside her and took out a quarter. 
E said: 

I want you to imagine that you are going on 
a long, long trip. You can go with your mother or 
your father, but you cannot go with both. I am 
going to toss this coin. “Heads” you go with your 
father, “tails” you go with your mother. You call. 

A preliminary trial was given to insure that S could
comply with instructions. The instructions were then
repeated, and a further trial was given. S’s call
was recorded on each trial.
E now gave the picture-story test. S was instructed:

I want you to make me up a story about this picture.
Tell me what the little boy is doing, what led
to this, how the little boy is thinking,
and how it will all turn out.


The only probes given were repetitions of the
original request or variations of these. Stories
were recorded on tape and later transcribed.

[...] test was then administered. S was seated at a
point in the room approximately 8 feet away
from her and presented with the Closure-Threshold Test.
The sets of cards were presented in a standard order.
S was shown the first (most difficult) card in a set
and was asked: “What
TABLE 1

DISTRIBUTION OF ERROR-FREE RECORDS ON
THE FIGURE-GROUND CONFUSION TEST

          Advanced   Average    Retarded
Errors    Readers    Readers    Readers
Present       7          7          2
Absent       11         13         18


Note.—Chi square = 4.797; p < .10. 


does this look like to you?” If S failed on this card, 
he was shown the next card in the series and was 
asked what it looked like. The procedure was con- 
tinued until S responded correctly or until all five 
cards in a series had been presented. On Cards II to 
V, if S said he did not know what the card looked 
like or that it did not look like anything, E said: 
“Perhaps you can tell me what it looks most like. 
What do you think it might be?” The procedure was 
continued until all 12 sets of cards had been shown. 

The testing was now complete, and S was taken
back to his classroom.


RESULTS 


Avoidance of Looking 

As expected, a large number of Ss obtained 
a perfectly correct record on the Figure- 
Ground Confusion Test. Consequently, a chi 
square analysis was performed with the data 
divided into two categories—errors present 
and errors absent. The distribution of cases 
among the three groups of Ss is given in 
Table 1. The results were in the predicted 
direction; however, the distribution could 
have occurred by chance 10 times in 100. 
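
The chi square reported in Table 1 can be checked directly from the six cell counts. The sketch below is illustrative only; the function name is ours, not the authors', and it assumes the ordinary Pearson statistic without any correction. With 2 degrees of freedom the upper-tail probability has the closed form exp(-chi2 / 2), so no statistics library is needed.

```python
import math

def chi_square_independence(table):
    """Pearson chi square for an r x c contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Table 1: rows are errors present / absent; columns are
# advanced, average, and retarded readers.
table_1 = [[7, 7, 2], [11, 13, 18]]
chi2, df = chi_square_independence(table_1)

# Closed-form upper-tail probability, valid only for df == 2.
p_value = math.exp(-chi2 / 2)
print(f"chi square = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

The result agrees with the tabled value to within rounding, and the probability falls between .05 and .10, as the text states.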

Results of the Closure-Threshold Test were
assessed as follows. For each set of cards, S’s
score consisted of the number of cards required
to elicit a correct response; if S failed


TABLE 2

ANALYSIS OF VARIANCE BY RANKS OF RESPONSES TO
THE CLOSURE-THRESHOLD TEST


Sums of Ranks 


Advanced Average Retarded 

Readers Readers Readers 

(N = 18) (N = 20) (N = 20) H 
580.0 701.5 6.219* 
TABLE 3

ADJUSTED GROUP MEAN LATENCIES (IN SECONDS) OF
RESPONSES TO THE NUDE DOLL


Advanced Average Retarded 
Readers Readers Readers 
(N = 18) (N = 20) (N = 20) 
1.417 1.744 2.044 

TABLE 4

ANALYSIS OF COVARIANCE OF LATENCIES OF RESPONSES
TO NUDE DOLL

                  Adjusted         Adjusted
Source               SS      df       MS         F
Between groups     3.137      2     1.568     10.181**
Within groups      8.331     54     0.154

** p < .001.


to respond correctly to the fifth (the easiest) 
card, he was arbitrarily assigned a score of 6. 
These component scores were summed to pro- 
vide a total score for each S. Total scores were 
then ranked for a Kruskal-Wallis analysis of
variance by ranks (Siegel, 1956). Results were 
as predicted (Table 2). 
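
For readers unfamiliar with the procedure, the H statistic is computed from the rank sums of the pooled scores. The following is a minimal sketch, assuming no correction for ties beyond assigning tied scores their average rank; the function name and the example scores are hypothetical and are not the study's data.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H (uncorrected for ties) from raw scores in several groups."""
    pooled = sorted(score for group in groups for score in group)
    # Assign each distinct score the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n = len(pooled)
    # Sum over groups of (rank sum squared) / (group size).
    total = sum(sum(rank_of[s] for s in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

# Hypothetical total scores for three small groups (illustration only).
hypothetical_scores = [[12, 15, 14], [18, 20, 17], [25, 22, 30]]
h = kruskal_wallis_h(hypothetical_scores)
print(f"H = {h:.3f}")
```

H is referred to the chi square distribution with one fewer degrees of freedom than the number of groups.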

The median latency of response to the first 
two dolls in the multiple-choice task was com- 
puted for each S. Medians were preferred to 
means as a measure of central tendency be- 
cause of the possible presence of a single de- 
viant response within a series of trials. Using 
the responses to the nude doll as the depend- 
ent variable, an analysis of covariance was 
carried out with responses to the Dutch doll 
as the covariant. Table 3 gives the adjusted 
group mean latencies to the nude doll, and 
Table 4 the results of the analysis of covari- 
ance. A test of homogeneity of regression had 
previously indicated that this procedure was 
justifiable (F = 1.95; p > .05). Differences 
were in the predicted direction and were significant
at the .001 level. Subsequent t tests
indicated that this result was largely due to 
the superior performance of the advanced 
readers, whose performance was significantly 
better (p < .001, one-tailed test) than that 
of either of the other two groups. Because all 
three groups of Ss tended to show some decrease
in latency in response to the nude doll
(undoubtedly a practice effect), a further
analysis of covariance of responses to the nude
doll was carried out, this time with responses 
to the boy doll as the covariant. Significant 
differences in the predicted direction were ob- 
tained (F = 6.46; p < .005). 
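
The F ratio in Table 4 follows from the adjusted sums of squares and their degrees of freedom. As a rough check, assuming only the tabled values (the function name is ours):

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return both mean squares and their ratio F."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# Adjusted sums of squares and degrees of freedom from Table 4.
ms_b, ms_w, f = f_ratio(3.137, 2, 8.331, 54)
print(f"MS between = {ms_b:.3f}, MS within = {ms_w:.3f}, F = {f:.2f}")
```

Dividing unrounded mean squares gives an F marginally below the tabled 10.181, which appears to have been obtained from the rounded mean squares.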


Parent Preference Test 


The results of the coin tossing test (second 
trial) are given in Table 5. The percentage 
of retarded readers showing preference for the 
same-sex parent was smaller than that of 
either of the other two groups (p < .02).
When results from the preliminary trial were 
included in the analysis, the differences among 


the three groups were considerably reduced 
(Table 6). 


Picture-Story Test 


The stories were scored from typescripts by
an assistant who did not know to which group
individual Ss belonged. Instructions for scoring
were as follows:


Story 1. Score + if taller figure is identified as the
father and if the boy in the picture expresses hostility
toward the father (i.e., is described as “angry,”
“mad,” or as wishing to harm the father).


TABLE 5

DISTRIBUTION OF PREFERENCES ON COIN-TOSSING
TEST: SECOND TRIAL

              Advanced   Average    Retarded
Preference    Readers    Readers    Readers
For father       13         10          4
For mother        5         10         16

Note.—Chi square = 10.561; p < .01.


TABLE 6

DISTRIBUTION OF PREFERENCES ON COIN-TOSSING
TEST: COMBINED TRIALS

                       Advanced   Average    Retarded
Preference             Readers    Readers    Readers
For father on both
  trials                   9          7          3
For mother on
  one or both trials       9         13         17

Note.—Chi square = 5.339; p < .10.
Story 2. Score + if the boy in the picture (a) does
not mention the figure behind the shower curtain, or 
(b) is described as unwilling to enter the bathroom 
because someone is already in there, or (c) is de- 
scribed as being afraid, ashamed, or guilty, because 
he has seen the figure in the shower. 

Story 3. Score + if seated figure is identified as 
the father and if the boy expresses hostility toward 
the father (as for Story 1). 

Story 4. Score + if the boy in the picture (a) does
not describe the man and woman as kissing or making
love, or (b) is described as being afraid, ashamed,
or guilty because he has seen the adult figures making
love.


For the purpose of analysis, Ss were re- 
garded as showing hostility toward the father 
if they received a + score on either Story 1 
or Story 3, and as showing fear of looking if 
they received a + score on either Story 2 or 
Story 4. Subsequent chi square tests failed to 
support the hypotheses being tested.


Discussion 


Only the “fear of looking” tests could provide
crucial support for the psychoanalytic
theory. One of these, the picture-story test,
yielded completely negative results. The multiple-choice
task, on the other hand, may be
interpreted as supporting the psychoanalytic
theory. However, in view of the negative results
of the picture-story test, alternative interpretations
must be favored.

Since the differences between advanced readers
and both the other groups of Ss on the
multiple-choice task were significant at the
.001 level, interpretation could depend heavily
upon the characteristics of over-achievers
and particularly upon information about child
training practices that tend to produce high-achieving
children. Unfortunately, little is
known about the family backgrounds of children
who are exceptionally advanced readers.
On the other hand, the studies of McClelland,
Atkinson, Clark, and Lowell
give some information concerning the
family backgrounds of Ss with high need
achievement. It is possible that advanced
readers fall within the larger group of
individuals who are highly oriented toward
achievement in general. Making this assumption,
one might expect that advanced
readers come from homes in which there is
considerable stress on independence training,
in which the parents are democratic rather
than autocratic, and in which conventional 
moral standards are not highly stressed. It is 
this latter factor that may be important in 
interpreting the results of this experiment. A 
child who is not deterred from peering into a 
compartment that, if his choice is correct, con- 
tains a nude figure with genitals may thus be 
the product of a home in which conventional 
moral standards are not stressed. 

Observance of conventional moral stand- 
ards in the area of sex training involves con- 
siderable emphasis on modesty (Sears, Mac- 
coby, & Levin, 1957). If the emphasis on 
modesty training is not so strong in the homes 
of high achieving children, one might expect 
that these children have had opportunity to 
see their same-sex siblings and fathers in the 
nude. Therefore one would not expect the 
over-achieving children in this study to be 
greatly deterred from responding quickly in 
a task involving a nude doll of the same sex 
as themselves. 

McClelland et al. (1957) have found that 
individuals with high need achievement tend 
to come from homes in which the parents 
make strong demands for independence, in- 
cluding strong achievement demands. Thus, 
these children may not only be relatively un- 
inhibited about sexual matters, but may also 
have been reinforced for exploratory behavior.
This latter factor, along with low emphasis
on modesty training, may be partly responsible
for the obtained difference between
the over-achievers and the other two groups
of Ss.

Since, for all three groups of Ss, there was 


a decrease in median latency over the three 
sets of trials, it seemed possible that the inclusion
of the nude doll as a test object was,
in fact, of little importance, and that the results
of the discrimination test merely reflected
differences in learning capacity in a
perceptual-motor task. This interpretation is 
partly borne out by Figure 2. In this figure 
changes in test object (Dutch doll, nude doll, 
boy doll) are ignored. The median response of 
each S on each block of 5 trials was first 
identified. These median latencies were then 
averaged to provide an indication of changes
in latency over a series of 30 trials. The figure
indicates that advanced readers improve


[Figure 2. Vertical axis: latency in seconds. Horizontal axis: trials in blocks of five (1-5 through 26-30). Separate curves for advanced, average, and retarded readers.]

FIG. 2. Change in latencies of advanced, average,
and retarded readers over a series of 30 trials, for all
stimuli in the “looking” test.


in performance at a much faster rate than do 
retarded readers. Average readers show an
initial improvement that is intermediate between
that of the other two groups, then perform
somewhat erratically. These results suggest
the necessity for a further test in which
a single test object is utilized and in which 
trials are continued until asymptotes are 
reached for all three groups. 
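
The smoothing procedure behind Figure 2 (a median for each S on each block of 5 trials, then an average of those medians across Ss) can be sketched as follows. The function names and latencies are hypothetical and serve only to illustrate the two steps; they are not the study's data.

```python
from statistics import mean, median

def block_medians(latencies, block_size=5):
    """Median latency of one subject within each consecutive block of trials."""
    return [
        median(latencies[i:i + block_size])
        for i in range(0, len(latencies), block_size)
    ]

def group_curve(subjects, block_size=5):
    """Average the per-subject block medians across all subjects in a group."""
    per_subject = [block_medians(s, block_size) for s in subjects]
    return [mean(block) for block in zip(*per_subject)]

# Two hypothetical subjects, 10 trials each (latencies in seconds).
hypothetical_group = [
    [3, 2, 4, 2, 3, 1, 2, 1, 2, 1],
    [5, 4, 5, 4, 5, 3, 3, 2, 3, 3],
]
curve = group_curve(hypothetical_group)
print(curve)
```

Taking the median within each block before averaging keeps a single deviant trial from distorting a subject's contribution to the group curve, which is the same reason given above for preferring medians.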

On the two perceptual tests the retarded 
readers performed more poorly than both the 
average and the advanced readers. Once again, 
these results could be interpreted as support- 
ing the psychoanalytic theory. A simpler ex- 
planation, however, is that perceptual skills 
are highly developed among advanced readers 
and poorly developed among retarded readers, 
and that the development of these skills is related
to reinforcement of exploratory behavior
by care-taking adults.

The above explanation of the findings implies
that fear of looking is not a causative
factor producing differences among the three 
groups of Ss in their responses to the nude 
doll. It is suggested that, at the most, par- 
ents who inhibit exploratory behavior are also 
nonpermissive in their modesty training and 
that, as a consequence, children who are re- 
tarded readers tend also to be sexually in- 
hibited. In this connection it is important to 
remember that during a child’s early years 
sexual behavior largely occurs in the form of
curiosity or exploratory behavior involving
perceptual-motor responses.

In an attempt to integrate the findings concerning
hostility and identification into a tentative
theory concerning the antecedents of
reading disability, the fantasy data will be
ignored on the grounds that, in spite of a
widespread utilization in clinical settings, such
data seem to have no consistent relationship
with supposedly corresponding overt responses.
The coin tossing test suggests that
boys who are retarded readers are relatively
hostile toward their fathers.

Let us now make the further assumption,
for which child training studies afford some
justification, that mothers tend to be somewhat
nonpermissive concerning exploratory
behavior and that fathers are more variable
in this respect. In this case, the amount of reinforcement
which exploratory behavior receives
will depend considerably upon the father’s
behavior patterns. Hostility to the father
may then be viewed as an outgrowth of
paternal nonpermissiveness and punitiveness,
i.e., frustration, of exploratory behavior, of
which sexual behavior is an important facet.


SUMMARY 


Psychoanalysts have attributed reading disability
to three, supposedly related, factors:
fear and avoidance of looking; hostility, primarily
toward the same-sex parent; and failure
to identify with the same-sex parent.

Hypotheses suggested by psychoanalytic
theory were investigated in a study in which
Grade 3-6 boys were employed as Ss. Retarded
readers performed more poorly on
perceptual tasks involving recognition of form
than did average and advanced readers. They
were slower in opening a compartment to look
for a male nude doll than were the other two
groups. In addition, they chose their father
less often on a simple parent preference test.
On the other hand, fantasy data failed to support
either the “fear of looking” or the hostility
hypothesis.

Some of the above results may be interpreted
as supporting the psychoanalytic theory.
An alternative interpretation in terms
of parental conditioning of exploratory and
sexual responses was nevertheless favored.
PSYCHOLOGISTS’ JUDGMENTS OF PHYSICAL
HANDICAP FROM H-T-P DRAWINGS


ORVAL G. JOHNSON
Lewis County Schools, Washington

AND

FRANK WAWRZASZEK
Eastern Michigan University


The psychological characteristics of the
physically handicapped person have been
studied increasingly in recent years. Force
(1956), Lerner and Martin (1955), Shere
(1956), Whitehouse (1953), Wrightstone
(1957), and Cruickshank (1955) suggest
various degrees and kinds of differences psychologically
between handicapped and nonhandicapped
persons. On the other hand, Levy
and Michelson (1952) and Wenar (1956,
1958) find no differences or only minor differences
between the two groups. Berreman
(1954) looks at the problem from the standpoint
of the attitudes shown toward handicapped
people, which he says are different
from attitudes toward normals, and are likely
to affect the self-images of the handicapped.
Wawrzaszek, Johnson, and Sciera (1958)
found no differences between handicapped and
nonhandicapped children on any of 10 variables
derived from the House-Tree-Person
Test.


Problem


The purposes of this study are twofold:

1. To determine whether or not physically
handicapped children project into their drawings
any feelings about their handicaps to the
extent that the handicap can be detected by
psychologists through an analysis of their
drawings.

2. To investigate the characteristics of
drawings that psychologists use to postdict
which of a matched pair of children is handicapped.


METHOD


The H-T-P test was administered to 37 handicapped
children in the special classes of a public
school. Sixteen were postpolio cases, seven cardiac,
seven cerebral palsied (mild), two spina bifida, and
one each Perthes hip, muscular dystrophy, congenital
deformity, slipped epiphysis, and brittle bones.

These children were selected from a slightly larger
group, eliminating those who might be hampered in
their drawing by impaired motor coordination. This
selection was accomplished with the consultation of
the physical therapist.

A control group was made up by matching each
of the handicapped on the basis of age, sex, and IQ.
The chronological age of each control was within 3
months and the IQ within 10 points of that of the
handicapped child. Therefore, by definition, the average
chronological age and IQ for both groups were
the same. Actual computations showed that there was
no significant difference between the two groups in
CA and IQ. All but a few of the handicapped and
control children had been tested with individual
intelligence tests. Table 1 shows the mean, range, and
SD of CA and IQ for both groups.


TABLE 1

MEAN, RANGE, AND STANDARD DEVIATION OF HANDICAPPED
AND NONHANDICAPPED GROUPS ON CHRONOLOGICAL
AGE AND INTELLIGENCE QUOTIENT

             Chronological Age              Intelligence Quotient
Measure   Handicapped   Nonhandicapped   Handicapped   Nonhandicapped
Mean         116.3          118.3           103.4        [illegible]
Range     6-5 to 13-6    6-6 to 13-5      63 to 128      [illegible]
SD            25.6           26.1            13.9        [illegible]


The H-T-P test was administered to each child in
accordance with the instructions suggested by Buck
(1948). The matched pairs of protocols, randomized
for order of handicapped and nonhandicapped, were
presented to nine psychologists with the following
instructions:


These are H-T-P protocols of 37 pairs of children
matched for sex, CA, and IQ. One of the children
is physically handicapped and one is his nonhandicapped
control. You are to judge from the
protocol which is the physically handicapped child
and which is the control. Please clip a slip of paper
onto the protocol of the handicapped child, noting
briefly the basis for your decision in each case.


All of the psychologists had experience in working 
with children, a majority of them having done psychological
work with physically handicapped children.
The drawings of the Person were evaluated according
to the Goodenough (1926) scale, and an MA
based thereon derived for each subject. The drawings 
of the Person were also measured for height. 


RESULTS 


Of the 333 judgments (9 judges × 37 judgments
for each), 208 or 62% were correct.
This proportion is significantly different (.01
level) from chance expectation. The percent- 
ages correct for the judges individually ranged 
from 54 to 65. None of these is significantly 
different from chance. 
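
The contrast between the pooled result and the individual judges is largely a matter of sample size, as a quick exact binomial check shows. The sketch below assumes that all 333 pooled judgments can be treated as independent trials with a chance probability of .5, which is a simplification since each pair was judged nine times. The function name is ours, and the figure of 24 correct out of 37 is our reading of the best judge's 65%.

```python
from math import comb

def binomial_two_sided_p(successes, n):
    """Exact two-sided binomial probability against chance (p = .5)."""
    k = max(successes, n - successes)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # The distribution is symmetric at p = .5, so double the upper tail.
    return min(1.0, 2 * upper_tail)

p_pooled = binomial_two_sided_p(208, 333)    # all nine judges combined
p_best_judge = binomial_two_sided_p(24, 37)  # about 65% correct for one judge
print(f"pooled p = {p_pooled:.6f}, best single judge p = {p_best_judge:.3f}")
```

With only 37 judgments apiece, even the most accurate judge cannot be distinguished from chance at the .05 level, whereas the same rate of success over 333 judgments can.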

In many cases there was substantial agreement
among the psychologists as to which of
the protocols was that of the handicapped
child. Sometimes they agreed and were right,
and sometimes they agreed and were wrong.
If we arbitrarily assume that more than two-thirds
agreement by the psychologists is substantial
agreement, then it can be said that in
27 out of 37 possible cases, or 73%, there
was substantial agreement. In 19 out of the
27 cases where substantial agreement occurred,
the majority made a correct judgment.
Thus, when they agreed they were right
in 70% of the cases, wrong in 30%.

Practically all of the reasons given by the
judges for classifying a protocol as handicapped
or nonhandicapped involved psychodynamic
interpretations. Some typical reasons
are as follows:

Treatment of end gables of house. Anxiety reflected
in shading on trees.

Although I like the person I do feel there is some
distortion [...]

Distorted perspective on house. Anxiety reflected
in tree.

Much anxiety in human drawing. Great difficulty
in form [...]

Faltering lines. Hands omitted.

Small person in relation to other [...]

The handling of the branch structure of the tree
indicates some confusion; the general quality of the
person is inferior.

Insecure house. Height relation to width. Chopped
off trunk which implies trauma.

Disability suggested in figure drawing.


To determine whether or not some aspect of
the drawings other than the [...] characteristics
may have influenced the judges, the
TABLE 2

DISTRIBUTION OF MAJORITY JUDGMENT BY 9 JUDGES OF
36 MATCHED PAIRS ON RIGHT-WRONG AND GOODENOUGH
MENTAL AGE VARIABLES

                                      Majority Judgment
Goodenough MA                       Right   Wrong   Total
Handicapped lower than control        21       2      23
Handicapped higher than control        2      11      13

Total                                 23      13      36

Note.—One of the matched pairs of children had equal Goodenough
mental ages. Tables 2 and 3 do not include this pair.


Goodenough scores were determined by an 
analysis of the drawing of the Person. Table 2 
shows the relation between rightness or wrong- 
ness of the majority judgment and the vari- 
able of whether the handicapped child’s Good- 
enough MA was higher or lower than that of 
the control. 

From Table 2 it is apparent that the ma- 
jority judgment was correct in 23 of the 36 
possible cases, and that the Goodenough MA 
of the handicapped child was lower in 23 out 
of 36 cases, not significantly different from 
chance. (The average Goodenough MA dif- 
ference was not significant at the .05 level.) 
It is evident also that there was a marked 
tendency for the majority judgment to be 
right when the Goodenough MA of the handi- 
capped child was lower than that of the con- 
trol. The majority judgment was usually 
wrong, however, when the Goodenough MA 
of the handicapped member of the pair was 
higher than that of his control. The chi square 
for the data in Table 2 was significant beyond
the .001 level, after correction for continuity
according to Yates. These tendencies
are accentuated if only those cases were 
chosen where there was over two-thirds agree- 
ment among the judges. Table 3 shows the 
distribution of 26 “substantial agreement” 
cases. 
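
The continuity-corrected chi square for Table 2 is not reported, but it can be recovered from the four cell frequencies. This is a sketch under the usual Yates formula for a 2 × 2 table; the function name is ours, and the tail probability uses the standard identity for 1 degree of freedom.

```python
import math

def yates_chi_square(a, b, c, d):
    """Chi square with Yates continuity correction for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Table 2: rows are handicapped MA lower / higher than control;
# columns are majority judgment right / wrong.
chi2_yates = yates_chi_square(21, 2, 2, 11)

# Upper-tail probability for 1 degree of freedom.
p_yates = math.erfc(math.sqrt(chi2_yates / 2))
print(f"corrected chi square = {chi2_yates:.3f}, p = {p_yates:.6f}")
```

The corrected value is well above the 10.83 needed for significance at the .001 level with 1 degree of freedom, in line with the statement in the text.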

Table 3 shows strikingly what has been the 
pattern of choices by the psychologist judges. 
In every case in the table, the judges (as a 
group) chose as the handicapped child the one 
with the lower Goodenough MA. In 18 out of 
26 cases they were right. (It will be remembered
that the average Goodenough MA of
the controls was greater than that of the 
handicapped.) They were wrong in the eight 
TABLE 3

DISTRIBUTION OF “SUBSTANTIAL AGREEMENT” JUDGMENTS
ON RIGHT-WRONG AND GOODENOUGH MENTAL
AGE VARIABLES

                                     Group Judgment
Goodenough MA                       Right   Wrong
Handicapped lower than control        18       0
Handicapped higher than control        0       8


cases where the handicapped child’s Good- 
enough MA was higher than the control’s. 
Several of the judges gave as reasons for 
their decisions some evidence of self-deprecia- 
tion, of a depressed self-concept in the proto- 
cols of some youngsters whom they therefore 
judged to be handicapped. A tendency to 
minify the Person was considered by some 
judges to be a projection of inferiority feel- 
ings arising as a result of the handicap. The 
height of the Person was measured as a check 
on this assumption. In 22 out of 37 cases, the 
Person drawn by the handicapped child was 
larger than that of the matched control child. 


This proportion is not significantly different 
from chance for N = 37. 


Discussion 


The most likely inference is that the judges, 
while verbalizing psychodynamic bases for 
their decisions, were using primarily the in- 
tellectual characteristics of the drawings in 
judging which drawing belonged to the handi- 
capped child and which to the control. They 
attributed the “better” drawing to the con- 
trol and were right oftener than wrong, pos- 
sibly because the Goodenough scores of the 
controls were higher in 62% of the cases than 
those of the handicapped children. If one took 
only the Goodenough MA scores and “judged” 
the better drawing to be that of the control, 
his percentage of correct judgments would be 
almost exactly the same as that of the total 
(or average) for the nine judges. It is pos- 
sible, of course, that there are dynamic vari- 
ables associated with “goodness” of the draw- 
ings, and the judges used these as criteria for 
their decisions. It is also possible that psy- 
chologists with intensive experience in the 
interpretation of drawings would have had a 
larger percentage of correct judgments, although
the study of Schmidt and McGowan
(1959) suggests otherwise. 


SUMMARY 


H-T-P drawings of 37 pairs of elementary 
school age children, one physically handi- 
capped and one a control matched for age, 
sex, and IQ, were judged individually by nine 
psychologists to postdict which drawing was 
made by the handicapped child. The percent- 
age of correct judgments was not significantly 
above chance for any one judge, although 
when the judgments were pooled the com- 
bined percentage correct was significantly 
greater than chance. There was a strong and 
significant tendency for the judges to attribute 
the drawing with the higher Goodenough 
score to the control subject. The handicapped
child’s drawing of a Person tended to be
larger than that of the control, but the difference
was not significant.
CHANGES IN INTELLECTUAL FUNCTIONS OF CHILDREN
IN A PSYCHIATRIC HOSPITAL 


E. WESLEY HILER AND DAVID NESVIG


Mental Health Research Institute, Fort Steilacoom, Washington 


It is a well-known fact that emotional dis- 
turbances can be intellectually incapacitating, 
interfering with concentration, learning, mem- 
ory, judgment, and reasoning. Hence it is to 
be expected that such disturbances will not 
only interfere with academic functioning, but 
will also impair performance on psychological 
tests. 

Despite the fact that IQ scores are widely 
interpreted as measures of intellectual ca- 
pacity, the experienced clinician usually re- 
gards the test scores of seriously disturbed 
children as measures of intellectual function- 
ing at the time of testing rather than as rep- 
resenting actual intellectual potential. The 
latter basic capacity or potential is usually 
inferred from those aspects of the test per- 
formance, such as the vocabulary level, which 
are assumed to be less influenced by emotional 
disturbance. Then, too, when a patient whose 
general test performance is poor or mediocre 
does well on some of the difficult items, one 
is led to suspect that the actual intellectual 
Capacity is greater than the IQ score would 
suggest. Marked discrepancies among the 
Wechsler-Bellevue subtest scores are often the 
basis for inferring intellectual impairment of 
either functional or organic origins (Wechs- 
ler, 1958). 

Although the average IQ of a group of 
normal children usually remains constant 
(Brown, 1950; Gehman & Matyas, 1956), 
certain individuals show a marked improve- 
ment and others a marked decline in test per- 
formance during the course of childhood. Such 
variations have been found to be related to 
emotional adjustment (Allen & Young, 1943; 
Clarke & Clarke, 1953; Despert & Pierce, 
1946). It has also been reported that the IQ
often rises as a consequence of successful psy- 
chotherapy (Chidester, 1934; Dulsky, 1942; 


Harrower, 1958; Hunsley, 1939; Miller, 1933) 
or other forms of treatment (Fisher, 1949; 
Markwell, Wheeler, & Kitzinger, 1953; Rabin, 
1944). Change to a better environment also 
often leads to an improvement in test per- 
formance (Skeels & Harms, 1948). Children 
in warm, democratic homes were found to im- 
prove in IQ during childhood, whereas chil- 
dren in actively hostile and passive-neglectful 
homes tended to decline in IQ (Baldwin,
Kalhorn, & Breese, 1945). The improvement 
in test performance is usually not uniform 
throughout the test. Certain aspects of test 
performance improve more than other aspects.
Thus Harrower (1958) reports that a rise in
Comprehension and Similarities scores on the
Wechsler-Bellevue is related to clinical improvement
in a group of adults. Kessler
(1947) found that Picture Completion and
Comprehension showed a significant rise after
electroshock treatment, and Fisher (1949)
reports significant improvement in Comprehension,
Similarities, and Digit Symbol after
EST.

The present study was carried out to investigate
the effect on intellectual functions
of the treatment program for children at
Western State Hospital. This treatment program
includes care by attendants selected for
their ability to relate to children with warmth
and with consistent discipline, classes conducted
by teachers specially trained to deal
with the emotionally disturbed, and participation
in various recreational activities, arts
and crafts. An attempt is made to create
a stable, noncompetitive environment with a
minimum of stress. In addition, some patients
receive tranquilizers and a few receive psychotherapy.
Removal from an emotionally
disturbing home environment and placement
in a relatively stable environment may be 
considered therapeutic in itself. 

These children’s principal intellectual deficiency
was in Verbal IQ, which averaged
about 14 points below Performance IQ.
Therefore, we hypothesized that they would 
improve primarily on Verbal subtests and 
Verbal IQ. Inhibited, compulsively achieving 
children often have a Verbal IQ above the 
Performance IQ. In such children, one might 
expect Performance IQ to rise with clinical 
improvement. The problems of hospitalized 
children, however, differ from those of the 
typical neurotic patients seen in child guidance
clinics. The child in a mental hospital is
more apt to have a history of delinquent acting-out
and school failure. His poorly controlled
hostile impulses impede his learning in
school, especially his learning of verbal material.
He fails to acquire the normal fund of
factual information, arithmetic skill, common
sense, and reading ability which directly or
indirectly is measured by the Verbal section
of the Wechsler. Children with these characteristics
are often sent to correctional schools.
Several studies have reported the Verbal IQs
of delinquents as below their Performance
IQs (Bernstein & Corsini, 1953; Wechsler,
1958). A similar pattern was found for unsuccessful
readers (Graham, 1952). The more
acutely disturbed delinquent child is frequently
sent to a state hospital. Many of the
children in this hospital are either transferred
to it from correctional schools or sent to the
hospital as an alternative to correctional
school.


METHOD

As part of another study, all children and adolescents
up to the age of 18 who were admitted to
Western State Hospital after a certain date were administered
a battery of psychological tests at regular
intervals. Out of the group of 40 cases which had
been retested, we selected all those: (a) who had
been given the Wechsler-Bellevue upon admission,
between 2 and 3 months after admission,
and between 12 and 24 months after that; (b) who
had not been out of the hospital more than [...]
between the second and third testing; and (c) who
attended the hospital school. This gave a
sample of 20 cases. The average age on
admission was 13.8 years with a range of 10.0
to [...] years. Diagnostic categories included [...]
schizophrenics, three organics with [...],
nine psychoneurotics, and [...] disorders.
Two additional cases, which had not had the
first retest, were added to our sample for our comparisons
of test improvement with rated improvement.
Most of the cases in our sample had also been
given the Bender-Gestalt test, from which was obtained
a Pascal-Suttell Z score, the Goodenough
Draw-A-Man, and the Gray Oral Reading Paragraphs
Test.

On the Wechsler-Bellevue, in addition to making 
comparisons of IQs and subtest scores, subtests were 
grouped on the basis of Cohen’s (1957, 1959) factor 
analysis of the Wechsler intelligence tests. Scores 
were obtained on four factors he found for the age 
level of our sample. The Verbal Comprehension Fac- 
tor is the average of Information, Comprehension, 
Similarities, and Vocabulary; Perceptual Organization
is the average of Object Assembly and Block Design; 
Freedom from Distractibility is the average of Arith- 
metic and Digit Span; and Judgment (Cohen's Ver- 
bal Comprehension II) is the average of Compre- 
hension and Picture Completion. 
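
The factor groupings just listed amount to simple averages of weighted subtest scores. A minimal sketch of that bookkeeping follows; the function name and the sample profile are hypothetical illustrations, not values from the study.

```python
from statistics import mean

# Subtest groupings exactly as stated in the paragraph above.
FACTORS = {
    "Verbal Comprehension": ["Information", "Comprehension", "Similarities", "Vocabulary"],
    "Perceptual Organization": ["Object Assembly", "Block Design"],
    "Freedom from Distractibility": ["Arithmetic", "Digit Span"],
    "Judgment": ["Comprehension", "Picture Completion"],
}


def factor_scores(subtests):
    """Average the subtest scores that belong to each factor.

    A factor is skipped (not estimated) if any of its subtests is missing.
    """
    scores = {}
    for factor, names in FACTORS.items():
        if all(name in subtests for name in names):
            scores[factor] = mean(subtests[name] for name in names)
    return scores


# Hypothetical weighted scores for one child (illustrative only).
example = {
    "Information": 5, "Comprehension": 6, "Similarities": 8, "Vocabulary": 7,
    "Object Assembly": 12, "Block Design": 9,
    "Arithmetic": 4, "Digit Span": 6, "Picture Completion": 8,
}
print(factor_scores(example))
```

Note that Comprehension enters two factors, so Verbal Comprehension and Judgment are not independent scores.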

In this study, it is assumed that changes in test 
scores after 1 or 2 months reflect practice effects or 
adjustment to the testing situation and to the hos- 
pital environment in general. The changes in scores 
after 12-24 months are assumed to reflect more basic 
changes in intellectual functioning. 

Measures of behavioral changes were obtained in order to determine
whether improvement in test performance was accompanied by actual
improvement in condition. Change in condition was measured by a rating
scale consisting of 20 variables, each on a five-point scale.

An overall rating of improvement was obtained on 
each patient by averaging the ratings of the specific 
traits. This procedure would only be justified if there 
were considerable homogeneity among the traits 
rated. A homogeneity coefficient was, therefore, com- 
puted by dividing the average between-trait (within 
subject) variance by the total variance, subtracting 
this from 1, and taking the square root. The coeffi- 
cient was found to be .68; it indicates a moderate 
amount of homogeneity—sufficient to justify aver- 
aging the ratings on the individual traits to form a 
composite overall improvement score. 

Each child was rated by staff members who were 
well acquainted with him. Altogether 25 raters were 
used, including 14 ward attendants, 5 school teachers, 
4 social workers, 1 physician, and 1 EEG technician. 
The average number of raters for each child was 7. 

An interrater reliability coefficient was computed 
by dividing the average between-rater (within sub- 
ject) variance by the total variance, subtracting this 
from 1, and taking the square root. The coefficient 
was found to be .52. This is not very high. However, 
the use of many raters tends to counteract the un- 
reliability of the individual raters and thus the mean 
ratings on each subject are believed to be sufficiently 
reliable to serve as measures of clinical improvement. 
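
The homogeneity and interrater coefficients described above share one recipe: divide the average within-subject variance by the total variance, subtract from 1, and take the square root. The sketch below follows that recipe under two assumptions that the text does not settle: population (divide-by-n) variances are used, and every child counts equally in the within-subject average. The function name and sample ratings are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def consistency_coefficient(ratings_by_subject):
    """Return sqrt(1 - mean within-subject variance / total variance).

    ratings_by_subject holds one list per child. The entries are that
    child's trait ratings (homogeneity) or his raters' overall ratings
    (interrater reliability). Lists may differ in length.
    """
    within = mean(pvariance(ratings) for ratings in ratings_by_subject)
    pooled = [x for ratings in ratings_by_subject for x in ratings]
    total = pvariance(pooled)
    if total == 0:
        raise ValueError("the ratings do not vary at all")
    # Guard against small negative values that unequal list lengths can cause.
    return sqrt(max(0.0, 1.0 - within / total))


# Hypothetical five-point ratings for three children (illustrative only).
sample = [[4, 4, 5, 4], [2, 3, 2, 2], [3, 3, 4, 3]]
print(round(consistency_coefficient(sample), 2))
```

A value of 1 means the ratings of a given child never disagree; a value of 0 means the disagreement within children is as large as the spread of all ratings taken together.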

Different staff members rated different children;
some tended to rate generally high, while others gave
generally low ratings. Therefore, we measured the
bias of each rater by comparing his overall-improvement [...]
TABLE 1
CHANGES IN TEST SCORES ON RETEST

Variable                         M2       M3      M3 - M2     t
Full Scale IQ                   90.15    89.70     -.45       .27
Verbal IQ                       81.50    82.45      .95       .65
Performance IQ                 100.20    98.80    -1.40       .61
Performance-Verbal IQ           18.70    16.35    -2.35       .98
Subtest Total                   83.30    88.00    [...]      2.93*
Subtest AD                       2.25    [...]    [...]       .58
Information                      4.80    [...]    [...]      3.45**
Comprehension                    6.10    [...]    [...]      3.83**
Digit Span                       6.[...] [...]    [...]      1.55
Arithmetic                       4.[...] [...]    [...]      1.59
Similarities                     7.55    [...]    [...]       .43
Vocabulary                       6.85    [...]    [...]       .32
Picture Arrangement              9.45    [...]    [...]      [...]
Picture Completion               8.45    [...]    [...]      3.62**
Block Design                     9.10    [...]    [...]      [...]
Object Assembly                 12.40    [...]    [...]       .47
Digit Symbol                     7.80    [...]    [...]       .30
Verbal Comprehension             6.33    [...]    [...]      2.17*
Perceptual Organization         10.75    [...]    [...]       .56
Freedom from Distractibility     5.40    [...]    [...]      1.92
Judgment                         7.27    [...]    [...]      5.23**
Bender Gestalt Z                99.55    95.60    [...]      2.32

[Columns M1, M2 - M1, and the corresponding t: values not recoverable.]

Note. M1 is mean score on initial testing. M2 is mean score on retest
2-3 months after first testing. M3 is mean score on retest 12-24 months
after second testing.
* p = .05.
** p = .01.


[...] each child by averaging the bias
of each of his raters. We then obtained corrected ratings
for each child by subtracting the average rater
bias from each child's average rating.
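
The correction for rater bias can be sketched as follows. The exact yardstick against which each rater was compared is not legible in the text, so this sketch assumes a rater's bias is his own mean rating minus the mean of all ratings. That assumption, the function name, and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def corrected_ratings(ratings):
    """ratings: list of (child, rater, overall_rating) tuples.

    Assumed bias of a rater = his mean rating - mean of all ratings.
    Returns {child: mean rating - mean bias of that child's raters}.
    """
    grand_mean = mean(rating for _, _, rating in ratings)
    by_rater = defaultdict(list)
    by_child = defaultdict(list)
    for child, rater, rating in ratings:
        by_rater[rater].append(rating)
        by_child[child].append((rater, rating))
    bias = {rater: mean(vals) - grand_mean for rater, vals in by_rater.items()}
    corrected = {}
    for child, pairs in by_child.items():
        average_rating = mean(rating for _, rating in pairs)
        average_bias = mean(bias[rater] for rater, _ in pairs)
        corrected[child] = average_rating - average_bias
    return corrected


# Hypothetical data: rater X is lenient, rater Y is severe.
data = [("A", "X", 5), ("B", "X", 5), ("C", "Y", 3), ("D", "Y", 3)]
print(corrected_ratings(data))
```

In this toy case the two-point gap between children A and C comes entirely from who rated them, and the correction removes it. A limitation of any such scheme is that a rater who happened to see only the most improved children will look lenient.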

Our sample was small and not normally distributed
on the psychological test variables; therefore, we
dichotomized the scores on each variable at the median
and used nonparametric statistics to compare test
improvement with rated improvement. The significance
level was evaluated with Fisher's exact test. The degree
of relationship is indicated by the phi coefficient.
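
The median split, phi coefficient, and Fisher exact test named above can be illustrated with the standard library alone. This is a sketch under stated assumptions: ties at the median go to the lower group, and the exact test is two-sided; the text specifies neither. All function names and the sample gains are hypothetical.

```python
from math import comb, sqrt
from statistics import median


def median_split(scores):
    """Mark each score True if it lies above the group median.

    Sending ties to the lower group is an assumption of this sketch.
    """
    cut = median(scores)
    return [score > cut for score in scores]


def two_by_two(first, second):
    """Cross-tabulate two True/False lists into the cells a, b, c, d."""
    pairs = list(zip(first, second))
    a = sum(1 for x, y in pairs if x and y)
    b = sum(1 for x, y in pairs if x and not y)
    c = sum(1 for x, y in pairs if not x and y)
    d = sum(1 for x, y in pairs if not x and not y)
    return a, b, c, d


def phi_coefficient(a, b, c, d):
    """Phi for a fourfold table; undefined when any margin is empty."""
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a margin is zero")
    return (a * d - b * c) / denominator


def fisher_exact_two_sided(a, b, c, d):
    """Sum the probabilities of all tables no more likely than the observed one."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def probability(k):
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    observed = probability(a)
    smallest, largest = max(0, col1 - row2), min(row1, col1)
    return sum(
        probability(k)
        for k in range(smallest, largest + 1)
        if probability(k) <= observed * (1 + 1e-9)
    )


# Hypothetical gains for eight children (illustrative values only).
verbal_iq_gain = [6, 4, 5, 3, -2, -1, 0, -3]
rated_gain = [0.9, 0.7, 0.2, 0.8, 0.1, 0.3, 0.6, 0.0]
table = two_by_two(median_split(verbal_iq_gain), median_split(rated_gain))
print(table, phi_coefficient(*table), fisher_exact_two_sided(*table))
```

With samples this small a sizeable phi can still fail to reach significance, which is why the text reports both figures.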


RESULTS AND DISCUSSION


Test Improvement of Group as a Whole 


Table 1 contains the mean scores on the test variables on initial
testing, retesting 2-3 months later, and retesting 12-24 months after
that. It will be noted that on the first retest there is a rise of more
than 5 points (p = .05) on the Performance IQ. This rise in Performance
IQ results in a rise (p = .05) in the Full Scale IQ as well. There is
no evidence of a rise in Verbal IQ. These results are consistent with
other studies of practice effects (Derner, Aborn, & Canter, 1950;
Hamister, 1949; Hays & Schneider, 1951; Steisel, 1951). Because of the
increase in Performance IQ and the lack of increase in Verbal IQ, the
already large difference between Verbal and Performance IQ in this
group becomes even larger (p = .05).

The following subtests showed a significant improvement on the first
retest: Object Assembly (p = .05), Block Design (p = [...]), Digit
Symbol (p = .05), and Similarities (p = .05). The only factor to show a
rise on the first retest is Perceptual Organization (p = .01). These
changes are undoubtedly due at least in part to practice effect. Object
Assembly, Block Design, Digit Symbol, and Picture Arrangement were
reported as showing the greatest amount of practice effect after [...]
and 4 weeks for a group of normals (Derner et al., 1950). Some of the
improvement may reflect the stabilizing effect that the hospital has on
these patients during the first [...] months. This stabilization would
reduce [...] associations and result in an improvement in the ability
to perceive relationships as measured by the Similarities subtest.
Because of the marked increase on certain subtests which are already
high for this group, there is an increase in subtest variability
(p = .01). This suggests that one should be cautious in making
inferences of pathology on the basis of the subtest variability of
patients who have been tested before. Because practice has a greater
effect on the Performance IQ than on the Verbal IQ, one would usually
expect to find Performance IQ higher than Verbal IQ for individuals who
have been tested previously.

After a period of 12-24 months there is an appreciable rise in the
subtest total (p = .05), but this does not result in an improvement in
IQ because by that time the children fall in a different age bracket
and require a higher subtest total to achieve the same IQ. After 12-24
months the greatest rise occurs on the Information, Comprehension, and
Picture Completion subtests (p = .01).

The scores on two of Cohen's factors show a significant rise after
12-24 months. The improvement in the Judgment factor is highly
significant (p = .01). The improvement on the Verbal Comprehension
factor was smaller (p = .05). A small rise on the Freedom from
Distractibility factor approaches significance. It is interesting to
note that the subtests and factors showing the most improvement after
12-24 months were not the ones showing improvement after 2-3 months.


Test Improvement Related to Ratings of Improvement


There is a significant relationship (Φ = .55, p = .05) between
improvement in Verbal IQ and ratings of general improvement. Children
showing more than the median amount of improvement went up 4.8 points
in Verbal IQ, whereas children who showed less than the median amount
went down 1.7 points. The relationship between rated improvement and
Full Scale IQ was smaller but was also significant (Φ = .36, p = .05).

There was no significant relationship between clinical improvement and
improvement in Performance IQ (Φ = .10, p = [...]). Improvement on only
one subtest, Digit Span, was significantly related to overall clinical
improvement (Φ = .46, p = [...]). Information and Comprehension
approached significance (p = ns).


Changes on the Goodenough IQ and the 
Bender-Gestalt did not seem to be related 
to clinical improvement. Improvement on 
the Gray reading test was, however, related 
to rated improvement (N = 12, Φ = .66, p
= .05). Those rated as improving more than 
the median amount went up 1.5 years in read- 
ing level while those improving less than the 
median amount went up only .6 years. 

A comparison of each of the 20 rated vari- 
ables and the psychological test variables 
showed the following significant relationships 
(p = .05 unless specified) : 

1. Improvement in Verbal IQ was related 
to improvement in most of the 20 clinical 
variables but only the following relationships 
were statistically significant beyond the .05 
level of confidence: Achievement in School, 
Development of New Interests and Goals, 
Ability to Concentrate and Resist Distrac- 
tions, Reduction in Anxiety. 

2. Achievement in School and Develop- 
ment of New Interests and Goals were signifi- 
cantly related to improvement on the Infor- 
mation and Comprehension subtests. Reduc- 
tion in Anxiety was significantly related to 
improvement in Digit Span. Improvement in 
Dependability was related to improvement on
the Arithmetic subtest (Φ = .55, p = .05).
Improvement in Ability to Concentrate and
Resist Distractions was related to improvement
in Digit Span and the Gray reading test
(Φ = .56, p = .05).

3. Improvement on the Digit Symbol sub- 
test was significantly related to Decrease in 
Bizarre Thought Processes and also to De- 
crease in Delinquent Tendencies. 


SUMMARY 


This study was carried out to determine the effect of a hospital
program on the intellectual functioning of emotionally disturbed
children.

A sample of 20 children with a mean age of 13.8 years was tested on
admission to the hospital, retested 2-3 months later, and retested
again 12-24 months after that.

A significant rise occurred after 2-3 months on Wechsler Performance
IQ, Full Scale IQ, subtest total, subtest average deviation,
Similarities, Block Design, Object Assembly, and Digit Symbol, and the
Perceptual Organization factor. This improvement is partially
attributable to practice effect, but the improvement in Similarities
perhaps reflects a decrease in bizarre or irrelevant thought processes.

After 12-24 months, a marked improve- 
ment was shown on the Information, Compre- 
hension, and Picture Completion subtests; 
the Judgment and Verbal Comprehension 
Factors; and Bender-Gestalt performance. 
The group as a whole appears to be better 
organized perceptually, to have more com- 
mon sense, better judgment, and an increased 
ability to perceive relationships and distin- 
guish between essential and unessential as- 
pects of a situation. 

The Performance, Verbal, and Full Scale 
IQs did not go up for the group as a whole. 
However, it was noted that certain patients 
did improve considerably on these scales while 
others declined. It was hypothesized that such 
differences in the direction of change would 
be related to improvement or deterioration of 
the patient’s condition. 

Ratings of improvement on 20 clinical vari- 
ables were obtained from the hospital staff. 
Improvement in Verbal IQ, Full Scale IQ, 
Digit Span, and Gray Oral Reading test level 
was significantly related to overall clinical 
improvement. Improvement in IQ and in specific
subtests was found to be related to improvements
on specific traits.

It is concluded that most hospitalized chil- 
dren have problems which cause them to be 
retarded in verbal skills; as they improve, 
their Verbal IQ rises. It is suggested that the 
initial Verbal IQ is not a fair estimate of the
intellectual capacities of emotionally disturbed 
children, and that Performance IQ may pro- 
vide a more accurate measure of these chil- 
dren’s actual intelligence. 
A COMPARISON OF SOCIAL AND SOLITARY
MALE DELINQUENTS ¹


MARY H. RANDOLPH, HAROLD RICHARDSON, AND RONALD C. JOHNSON
San Jose State College


The contemporary theories concerning the causes of juvenile
delinquency might be [...] differentiated into two areas of emphasis:
psychologically oriented theories, such as those expressed by the
Gluecks (1950), Healy and Bronner (1936), and Lindner ([...]); and
sociologically oriented theories, such as those expressed by Shaw and
McKay (1942), Sutherland (1955), Thrasher (1936), and others. Both
points of view may be defensible since it seems likely that
psychological and sociological forces interact to varying degrees in
the histories of most delinquents.

Lindesmith and Dunham (1941) have differentiated between the
socialized and the individualized criminal. They state that the
socialized criminal is one who commits crimes that are supported and
prescribed by his culture so that, by committing a crime, he gains in
status and recognition. The socialized delinquent or criminal acts in
collaboration with other persons and is dependent on them for the
continuation of his criminal career. The individualized criminal, on
the other hand, [...] that are personal and private. He commits his
crimes [...] and, in theory, is a stranger to [...] commit similar
crimes. His criminal [...] is [...] an acceptable form of [...] social
milieu. The socialized criminal is likely to be a rather normal person,
[...] deviant social [...] common to his subculture. The individualized
criminal, at odds with [...], seems likely to [...] of deep [...]
psychological pressures. [...] (1949) has suggested that the solitary
delinquent is an individual with a “conscience defect” unconsciously
fostered by the parents, while the social delinquent is the product of
a subculture with delinquent values. Bloch and Flynn (1956) made
similar statements. Within this framework, the sociological theories
would seem most useful in explaining the delinquency of male social
delinquents (gang members and others who commit delinquent acts in the
company of others), while psychological explanations might best account
for the individualized or solitary delinquent.

¹ This report is based on a thesis submitted (by the senior author) to
the Department of Psychology, San Jose State College, January 1960.

It has been found (Hewitt & Jenkins as re- 
ported by Wattenberg & Balistrieri, 1950) 
that juvenile gang members are likely to come 
from homes of a lower socioeconomic stratum, 
while nongang members showed more indica- 
tions of coming from stressful or depriving 
homes of a middle socioeconomic level. Johnson (1950) found that
solitary delinquents were far more often recidivists than were social
delinquents, even though the majority of his social delinquents were
members of well organized juvenile gangs. This finding might be
expected if individual or solitary delinquents are merely acting out
symptoms of deep-seated and unresolved psychological stresses.

Beyond these scanty data, little is known about genotypic or
phenotypic variation between solitary and social delinquents, even
though treatment techniques used for the two groups might be made more
effective if this information were available. The purpose of this study
is to compare solitary and social delinquents with regard to several
“sociological” and “psychological” variables.

METHOD 


Subjects 

The sample consisted of 62 boys, aged 14-18, who had been adjudged
legally as juvenile delinquents. Of these, 52 subjects were at a
“ranch” for delinquent boys and the other 10 were in custody, awaiting
placement at this ranch. All boys were of at least dull normal
intelligence. Of the original sample, one subject was eliminated
because of insufficient responses to test items, and four subjects were
eliminated because extensive further examination showed them to have
had mixed (solitary and social) delinquent careers. Fifty-seven
subjects remained. Of these subjects, 39 had always been social and 18
had always been solitary in their known delinquencies.


TABLE 1

NUMBER OF SOLITARY AND SOCIAL DELINQUENTS FROM
EACH SOCIOECONOMIC LEVEL

              Upper     Lower     Upper     Lower
Delinquent    Middle    Middle    Lower     Lower      N
Solitary        4         7         5         2        18
Social          1         4        15        19        39

Note. χ² = 15.83, p < .01.


Measuring Devices 


Each subject was administered a Wechsler Adult 
Intelligence Scale (WAIS), and a Minnesota Multi- 
phasic Personality Inventory (MMPI). Socioeco- 
nomic status was determined by a local adaptation 
(Hodges, unpublished) of the Warner Index (Warner, 
1949). 


Procedure 


All subjects were tested inside an institutional set- 
ting. Tests were administered and scored according 
to standardized procedure except that all questions 
on the MMPI were read aloud to subjects, while the 
subjects read the questions in the booklet, in order 
to minimize difficulties in comprehension. All subjects 
knew that test results were confidential and would 
not influence placement. Only two MMPI records 
(both of social delinquents) had to be discarded as 
invalid. 


RESULTS 


On the WAIS IQ scores the social delin- 
quents had a mean of 93.23 with a standard 
deviation of 9.14. The solitary delinquents 
had a mean of 105.00 with a standard deviation
of 11.19. The t test of differences between
these means was significant beyond the
.01 level. The F test of differences between
variances was not significant. The mean IQ 
score of the solitary delinquents was exceeded 
by only 15% of the social delinquents. 
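
The reported means, standard deviations, and group sizes are enough to approximate the two tests mentioned above. The sketch assumes the standard deviations were computed with n - 1 in the denominator and that a pooled-variance t test was used; neither detail is stated in the text, and the function names are hypothetical.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups, from summary statistics."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


def variance_ratio(sd1, sd2):
    """F ratio with the larger variance in the numerator."""
    larger, smaller = max(sd1, sd2), min(sd1, sd2)
    return (larger / smaller) ** 2


# Figures from the text: solitary (N = 18) versus social (N = 39).
t_value, df = pooled_t(105.00, 11.19, 18, 93.23, 9.14, 39)
f_value = variance_ratio(11.19, 9.14)
print(f"t({df}) = {t_value:.2f}, F = {f_value:.2f}")
```

Under these assumptions the t value is large and the variance ratio is modest, which agrees in direction with the reported outcome of both tests.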

Many more solitary delinquents came from
upper socioeconomic levels than did social de- 
linquents, as shown in Table 1. 
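
The chi-square reported with Table 1 can be checked from the cell counts. In this sketch the solitary lower-middle count is taken as 7, the value that makes the solitary row sum to the stated N of 18. The function name is hypothetical.

```python
def chi_square(table):
    """Pearson chi-square for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(column_totals) - 1)
    return statistic, degrees_of_freedom


# Counts by level: upper middle, lower middle, upper lower, lower lower.
solitary = [4, 7, 5, 2]
social = [1, 4, 15, 19]
statistic, degrees_of_freedom = chi_square([solitary, social])
print(f"chi-square({degrees_of_freedom}) = {statistic:.2f}")
```

The result comes out near 15.8, close to the published 15.83; the small gap may reflect rounding in the original hand computation. Two expected frequencies in the upper-middle column fall below 5, so the approximation is rough at that end of the table.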

MMPI profiles are presented in Figure 1.

[Figure 1. Horizontal axis: MMPI scales by code number (1 Hs, 2 D,
3 Hy, 4 Pd, 5 Mf, 6 Pa, 7 Pt, 8 Sc, 9 Ma, 0 Si). Vertical axis:
T score. Broken line: solitary, N = 18. Solid line: social, N = 37.]

FIG. 1. Mean profiles for social and solitary
delinquents on the MMPI.

Mean differences between social and solitary
delinquents on the validating scales L,
F, and K were not significant. Both groups
scored rather high on F, found to be an indi- 
cator of psychopathology (Kazan & Schein- 
berg, 1945; Modlin, 1956) and of delinquency 
(Hathaway & Monachesi, 1953). Profiles of 
the two groups are similar but solitary delin- 
quents, as a group, appear somewhat more 
disturbed. All mean differences for the diag- 
nostic scales were significant except for the 
Ma scale. Differences on Mf, Pa, and Si were 
significant beyond the .05 level. Differences 
on Hs, D, Hy, Pd, Pt, and Sc were signifi- 
cant beyond the .01 level of confidence. The 
code type of the solitary delinquent was 8479'612305-, (4, 13, 11).
The social delinquent's code type was '8497613-, (3, 11, 10). Although
both types had high excitors (Pd, Ma), only the solitaries had high
[...] (Si, D, Mf), presumably indicating [...] trends. Complete
statistical data comparing social and solitary delinquents on the MMPI
are given in Table 2.


TABLE 2

DATA COMPARING SOCIAL AND SOLITARY
DELINQUENTS ON THE MMPI

(Social N = 37, Solitary N = 18)

          Mean      Mean       SD        SD
Scale     Social    Solitary   Social    Solitary    t
L          3.45      3.55      2.42      2.04         .15
F         11.23     12.76      4.58      4.55        1.49
K         10.45     10.94      3.97      6.62         .68
Hs        14.03     17.05      3.10      4.03        2.97**
D         18.77     21.94      3.08      4.71        3.23**
Hy        19.21     23.33      4.35      4.47        3.10**
Pd        27.62     30.89      4.38      4.63        3.05**
Mf        22.18     [...]      4.18      4.[...]     2.3[...]*
Pa        12.36     [...]      2.95      3.27        2.22*
Pt        30.26     [...]      5.95      6.29        3.37**
Sc        33.41     [...]      7.61      7.93        3.54**
Ma        24.18     [...]      4.17      4.70         .18
Si        28.85     [...]      6.09      8.27        2.48*

Note. These figures are expressed in MMPI raw scores.
* Significant at .05 level.
** Significant at .01 level.


DISCUSSION 


It seems relatively clear, from these results, that solitary and
social delinquents differ considerably from each other. The solitary
delinquent is more likely to come from a higher socioeconomic level and
to be of higher intellectual ability than the social delinquent, but
considerably more maladjusted. These findings might explain why, once
embarked upon a course of delinquent behavior, the solitary delinquent
is more inclined to be a recidivist.

Differences between the two groups of delinquents might be taken into
account in diagnosis, prognosis, and treatment. The prognosis for the
solitary delinquent without some form of therapy seems likely to be
poorer than for social delinquents. Current sociologically oriented
treatment techniques might more often be sufficient for the
rehabilitation of the social delinquent. So long as therapists are in
short supply, a more economical use of therapists' time might result
from consideration of the obtained differences between social and
solitary delinquents.


SUMMARY 

Test data were obtained from 57 delinquent boys, aged 14-18, all of
whom, at the time of testing, were within an institutional setting.
Thirty-nine of the subjects were “social delinquents” who committed
their crimes in the company of others. Eighteen subjects were “solitary
delinquents” who had committed their delinquencies alone. Wechsler
intelligence test (WAIS) scores indicated that solitary delinquents are
significantly higher than social delinquents in intellectual ability.
Solitary delinquents were also significantly higher in socioeconomic
status than were social delinquents. MMPI profiles of the two groups
were similar, but with a significantly greater elevation in all scales
except Ma for the solitary delinquent. The solitary delinquent appears
more likely to be a psychologically deviant individual who comes from
an ostensibly normal environment, while the social delinquent seems far
less deviant, in a psychological sense, but comes from an environment
where certain sociological factors, presumably causal to delinquency,
are operating. Certain implications of these findings were discussed.
THE DIMENSIONALITY OF RATINGS OF THERAPIST
VERBAL RESPONSES ¹


EDMUND S. HOWE AND BENJAMIN POPE


University of Maryland School of Medicine


The last few years have witnessed an im- 
portant trend in research in psychotherapy, 
toward study of the therapist as an independ- 
ent variable in the dyadic relationship. This 
trend has in part shifted the focus of empiri- 
cal attention away from such issues as theo- 
retical differences per se among “schools” of 
psychotherapy, and has instead directed re- 
search toward rigorous quantification of basic 
variables cutting across theoretical and prac- 
tical divergences among therapists. Among the 
most penetrating studies of this kind are those 
investigating the dimension and the dimen- 
sionality of Depth of Interpretation (e.g., 
Harway, Dittmann, Raush, Bordin, & Rigler, 
1955; Raush, Sperber, Rigler, Williams, Harway,
Bordin, Dittmann, & Hayes, 1956;
Speisman, 1959). Other investigators have
approached presumedly different aspects of
therapist verbal behavior such as Directiveness
(e.g., [...]) and Ambiguity
(e.g., Osburn, 1951), to mention but two.


¹ [...] out of research supported by Pilot Evaluation Grant No. 2M-6408
from the National Institute of Mental Health of the National Institutes
of Health, United States Public Health Service.

[...] Activity. Using attributes of “Ambiguity,” “Lead,” and degree of
“Inference” as aspects of the concept of Therapist “Activity,” an
Activity Scale was constructed on the basis of ratings, by
psychiatrists, of a representative sample of 50 therapist verbal
responses. The order of reliability observed among ratings used in
constructing the Activity Scale and ratings obtained from application
of the scale was about .50. While this average reliability coefficient
compares quite favorably with those reported by earlier investigators
of a different aspect of therapist verbal behavior (e.g., Harway et
al., 1955; Raush et al., 1956), 75% of the total variance is
nevertheless left as unexplained “error.” The rather low reliability
indices obtained in these and other studies are quite possibly due to
the assumed “dimension” in rating studies being multiple, rather than
unitary, as Raush et al. (1956) earlier pointed out. For, as Coombs
(1951) and Bordin, Cutler, Dittmann, Harway, Raush, and Rigler (1954)
have written, it is quite possible to “force” unidimensionality even
where it does not mathematically exist. Indeed, even while the studies
of therapist activity were being performed, it became subtly obvious to
the experimenter that the resistance of some of the psychiatrists to
performance of the rating task devoid of a flesh-and-blood patient was
at least in part due to an apparent confounding of their evaluative
attitudes with their judgments of activity according to relative
ambiguity, lead, and inference involved in each therapist response.
Were this the case, then one would statistically predict relatively low
reliability among ratings along a one-dimensional continuum.
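
The step from a reliability of about .50 to 75% unexplained variance is not spelled out in the text. One reading, offered here as an assumption, is that the squared coefficient is taken as the shared proportion of variance and the remainder is counted as error. The function name is hypothetical.

```python
def unexplained_variance(reliability):
    """Share of variance left over when r squared is taken as the shared part."""
    return 1.0 - reliability ** 2


print(unexplained_variance(0.50))
```

On this reading a coefficient of .50 leaves three quarters of the variance unaccounted for, which matches the figure quoted in the paragraph above.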

The study reported here was thus undertaken to explore the
dimensionality of ratings of such types of therapist verbal response as
had been used in the earlier presumptive one-dimensional studies of
therapist activity. Application of a 40-scale semantic differential to
therapist responses thus facilitated a check on the two general
propositions that such ratings of therapist verbal responses would be
primarily of an evaluative nature, and at least two-dimensional.


METHOD 


Raters. A decision was made to solicit the services of Board-certified
or Board-eligible psychiatrists, rather than of psychiatrists having
had some minimum amount of therapeutic experience. The subjects were
drawn from the Psychiatric Institute at the University of Maryland
School of Medicine, from those engaged in full-time private practice in
Baltimore City, from the National Institute of Mental Health, from the
Walter Reed Army Hospital Institute for Research, from Chestnut Lodge,
and from Spring Grove State Hospital, Maryland. The booklets described
below were mailed, after verbal agreement by telephone, to 50 subjects.
Of the [...] booklets returned,² 3 were discarded because of
inadvertent omission of at least 1 page, either by experimenter or by
subject. Data from the remaining 35 subjects were analyzed.

Therapist Verbal Responses. Ten therapist responses from the set of 50
used in the earlier rating study were selected for experimentation.
These [...], each chosen for specific reasons of [...] a priori
similarity to or extreme a priori dissimilarity from at least one other
response, are presented in Table 1. These reasons are now [...]
summarized. Responses 1 and 10 were selected [...] because they were
originally given the [...] extreme mean Activity Level (AL) ratings.
Responses 2 and 3 were selected because they were originally rated
equally, and were clearly of a [...] facilitating nature. Responses 4
and 5 were likewise chosen since both were about equally [...] and yet
somewhat more specifically focused than [...], and both still quite
low-active. Response 6 is unique in the set of 10 studied here, since
it is a highly specific, objective question. Responses 7 and 8 were
included since they constitute interpretive responses of subjectively
equal depth, and have equal ALs. Response 9 is evidently a
reassurance/supportive [...] response, subjectively quite distinctive
in its own right.


TABLE 1

THE 10 THERAPIST VERBAL RESPONSES PRESENTED TO
THE RATERS AND ORIGINAL MEAN ACTIVITY LEVELS

No.   Therapist Verbal Response                            AL
 1    Hm-hm                                                1.4
 2    And --?                                              2.4
 3    What's been happening?                               2.6
 4    By brooding you mean --?                             4.3
 5    Your heart --?                                       4.4
 6    How much do you earn?                                6.2
 7    Perhaps he feels attracted to you.                   7.8
 8    Maybe you participate in this, encouraging them      7.8
      to depend on you more than you think.
 9    I hope you don't think I'm impatient with you,       8.7
      because I'm not.
10    You have to tell me whether it is so or not.         9.7

Note. Based on the ratings of 30 psychiatrists during the
first, 50-item rating study.


[...] the 10 responses could clearly be used as [...]. The potential
disadvantage of making a deliberate selection (with the attendant risk
of bias) was largely offset by the definitive role the [...] could play
in the face validation of certain empirical outcomes to be described
later. The question whether Response 10 (and perhaps even Response 2)
normally occur too infrequently to be included without risk of
seriously biasing the empirical outcomes in this type of research
cannot be fully dealt with here, but it deserves comment. Response 10
(“You have to tell me whether it is so or not”) is essentially a
demanding and persuasive operation toward the patient. As such, it is
largely a tabooed response, negatively valued by most therapists. (A
comparable, though weaker argument might be made by some therapists
with regard to Response 9, which is essentially a reassurance operation
instigated by the therapist himself.) Operations of a persuasive nature
may, however, take many and variegated forms. Persuasion qua persuasion
rarely occurs in most accepted psychotherapies. But the type of
persistence and hounding of the patient's thoughts that occurs in the
published interviews of Deutsch and Murphy (1955), to cite only one
example, reflects from an operational standpoint a respectable
unwillingness of the therapist to let the patient “get away” until he
reports that which the therapist, perhaps unconsciously, wants to hear.
Thus, if the concept of persuasiveness be regarded more as an “attitude
of mind” in the therapist, than as the manifest form that his
communications to the patient actually take, then it becomes more
reasonable to include some sort of verbal stimulus connoting such a
therapist attitude. Indeed, inclusion of such an anchor stimulus would,
under the foregoing conditions, be as likely to reduce as to foster
bias in the empirical outcome.

² [...] thanks are here expressed to all of the [...] persons who were
kind enough to perform the ratings.
TABLE 2

THE SEVEN SETS OF BIPOLAR,
ADJECTIVAL SCALES

Hypothetical
Variable          Bipolar Scale

Ambiguity         Spacious-Constricted
                  Colorless-Colorful
                  General-Specific
                  Unfocused-Focused
                  Commonplace-Unique
                  Vague-Precise

Stressfulness     Cold-Warm
                  Calm-Excitable
                  Sober-Drunk
                  Relaxed-Tense
                  Cautious-Rash
                  Relieving-Painful

Inference         Subtle-Obvious
                  Inferential-Logical
                  Intuitive-Rational
                  Deep-Shallow
                  Private-Public

Lead              Following-Leading
                  Accepting-Demanding
                  Conforming-Directing

Evaluative        Skillful-Unskillful
                  Reputable-Disreputable
                  Wise-Foolish
                  Good-Bad
                  Accepting-Rejecting
                  Sensitive-Insensitive
                  Acceptable-Unacceptable
                  Valuable-Worthless

Activity          Still-Vibrant
                  Static-Dynamic
                  Slow-Fast
                  Inert-Energetic
                  Passive-Active

Potency           Muted-Blatant
                  Weak-Strong
                  Soft-Hard
                  Thin-Thick
                  Far-Near
                  Small-Large
                  Dull-Sharp

Note. The classification is arbitrary in several cases.


The Rating Booklet. Each rater was presented with 
a 20-page booklet. Each successive pair of pages 
contained a total of 40 seven-point, bipolar, adjec- 
tival scales, the therapist response to be judged ap- 
pearing at the top of each pair of pages. The 40 
scales were of course the same for each response, and 


Edmund S. Howe and Benjamin Pope 


they appeared in the same order. The order in which 
the therapist responses appeared, however, was varied 
in four ways (viz: Numbers 1-10; Numbers 10-1; 
Numbers 6-10, 1-5; Numbers 5-1, 10-6). In all 
other respects the format of the instructions to the 
subject followed that described by Osgood, Suci, and 
Tannenbaum (1957). The subject was instructed to 
assume that each response was made during an ini- 
tial interview. It was considered desirable to permit 
the subject to project his own feelings about con- 
text, since the generality of the findings would there- 
by be enhanced. 
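
The four presentation orders described for the booklet can be written out mechanically. The following is only an illustrative sketch; the function name is hypothetical and nothing of the kind appears in the original procedure. 

```python
def presentation_orders(n=10):
    """Return the four booklet orders: 1-10; 10-1; 6-10, 1-5; 5-1, 10-6."""
    forward = list(range(1, n + 1))              # Numbers 1-10
    half = n // 2
    split = forward[half:] + forward[:half]      # Numbers 6-10, 1-5
    # Each of the last two orders is the exact reverse of one of the first two.
    return [forward, forward[::-1], split, split[::-1]]


for order in presentation_orders():
    print(order)
```

Reversing and splitting the sequence in this way places each response early in some booklets and late in others, which is presumably why the orders were varied. 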

The Adjectival Scales. The set of 40 scales was 
selected after an exhaustive examination both of 
Roget’s Thesaurus, and of published work (e.g., Osgood 
et al., 1957) using different forms of the Semantic 
Differential. It was decided to include scales 
having some connotative reference to Ambiguity, 
Lead, and Inference, since these attributes had been 
used earlier as rough working referents of the concept 
of Activity. Scales having connotative reference 
to Stressfulness were also included because of such 
frequent explicit claims by previous subjects as, “I 
would consider something more ‘Active’ if it tends 
to upset the patient.” Finally, in view of their ubiquitous 
appearance in numerous reported studies, scales 
were included having established relevance to Osgood’s 
Evaluative, Potency, and Activity dimensions. 
These seven sets of bipolar adjectives are presented 
in Table 2. The present classification of individual 
items into the seven sets is, of course, purely arbitrary 
in several instances. 

Treatment of Data. A 40 X 40 intercorrelation matrix, 
with N = 35 (Judges) X 10 (Verbal Responses) 
= 350 pairs entering into each correlation, was obtained 
by IBM. The matrix was factored by the 
complete centroid method of Thurstone (1947), and 
since a maximum of seven hypothetical dimensions 
was implied by the initial categorization of the 
scales into seven sets, a total of nine factors was extracted, 
of which three were significant and interpretable. 
The centroid was orthogonally rotated by 
the quartimax method (Neuhaus & Wrigley, 1954). 
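
The pooling that yields 350 pairs per correlation can be sketched as follows. This is a minimal illustration, not the original IBM procedure: the nested-list layout `ratings[judge][response][scale]`, the function names, and the toy numbers are all assumptions made for the example. 

```python
import math


def pooled_rows(ratings):
    """Flatten ratings[judge][response] into one row of scale scores per pair."""
    return [row for judge in ratings for row in judge]


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(ratings):
    """Scale-by-scale correlations over all judge x response rows."""
    columns = list(zip(*pooled_rows(ratings)))   # one column per scale
    k = len(columns)
    return [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]


# Toy data (hypothetical): 2 judges x 2 responses x 2 seven-point scales.
toy = [
    [[1, 7], [3, 5]],   # judge 1
    [[6, 2], [4, 4]],   # judge 2
]
matrix = correlation_matrix(toy)
```

With 35 judges, 10 responses, and 40 scales the same functions would pool 350 rows and return a 40 X 40 matrix. Note that pooling treats the 10 ratings from one judge as separate observations. 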


RESULTS 
The Factor Analysis 


The rotated factor loadings are presented 
in Table 3. The first factor to emerge accounts 
for 33% of the total variance and 60% of 
the common variance. It is most clearly defined 
by the bipolar scales foolish-wise, unacceptable-acceptable, 
skillful-unskillful, good-bad, 
and valuable-worthless. The first two 
scales have rather pure, high negative loadings 
and the last three rather pure, high positive 
loadings on the first factor. Other, somewhat 
less pure but still significant loadings 
are observed for scales tense-relaxed, blatant-muted, 
and rejecting-accepting (negatively 
TABLE 3 
ROTATED FACTOR LOADINGS 
Scale I II III h² 

Tense-Relaxed —.83 =.16 03 
Skillful-Unskillful 87 —.02 — .04 
Hard-Soft —.78 —.26 —.06 
Passive-Active 38 -65 —.03 
Near-Far 252 — 34 01 
Reputable-Disreputable SO —.01 — 03 
Spacious-Constricted -60 11 13 
Foolish-Wise —.90 02 10 
Cautious-Rash 83 27 —.03 
Blatant-Muted —.83 =.28 =08 
Colorless-Colorful 09 a7 —.09 
Still-Vibrant AT 7 —.10 
Leading-Following 225 =52 04 
Small-Large 13 54 —.09 
Accepting-Demanding 75 16 20 
Strong-Weak 15) 73 —.05 
Specific-General —.20 —.13 
Good-Bad 90 —.05 
Rejecting-Accepting —.82 —.11 
Static-Dynamic —.32 = 17 
Conforming-Directing 38 07 
Thick-Thin —11 10 
Focused-Unfocused —.17 —.06 
Slow-Fast 19 =e 
Obvious-Subtle 61 30 
Sensitive-Insensitive 84 Jd 
Calm-Excitable 83 00 
Energetic-Inert —.16 02 
Cold-Warm —.ól —.09 
Inferential-Logical 16 a4 
Unique-Commonplace —.08 -28 
Vague-Precise 07 ld 
Unacceptable-Acceptable —.88 08 
Relieving-Painful dl 09 
Sharp-Dull = .05 
Sober-Drunk 1 =:i4 
Shallow-Deep —.61 = 1h 
Private-Public 37 16 
Intuitive-Rational 4 78 
Valuable-Worthless 89 —.08 

Σa² = 13.38 1.66 Σh² = 22.19 


loaded), and for scales cautious-rash, sensitive-insensitive 
and calm-excitable (positively 
loaded). There is no question but that the 
first factor may be appropriately interpreted 
as one of Professional Evaluation. Its nature 
implies that the “good” therapist, the one who 
is thoroughly reputable and skilled, uses responses 
which are cautious, relaxed, muted, 
accepting, sensitive, and calm. The emergence 
of such a factor first, further implies, as we 
shall see, that such evaluative connotations of 
the verbal responses are more compelling, 
more salient than are the connotations of the 
second and subsequent factors interpreted. 
The second factor accounts for 18% of the 
total variance and 32% of the common variance. 
It is most clearly defined by scales colorless-colorful, 
still-vibrant, and vague-precise 
(positive loadings), and by scales energetic-inert 
and strong-weak (negative loadings). 
Other, somewhat less pure but still significant 
loadings on the second factor are observed 
for scales passive-active and slow-fast 
(positive loadings) and for scales specific-general 
and focused-unfocused (negative loadings). 
The second factor is interpreted as one 
of Precision/Potency, although Ambiguity/Passivity 
would be almost as appropriate a 
label. This factor clearly refers to those attributes 
of therapist behavior variously referred 
to as Activity, Ambiguity, and the like. 
Its nature is reminiscent of the “dynamism 
factor” so labeled by Osgood et al. (1957) to 
describe the apparent coalescence of their 
second (Activity) and third (Potency) factors 
in the judgment of sociopolitical concepts. 

Fig. 1. Scales having significant rotated loadings on 
Factor 1, or on Factor 2. 

The third factor, which accounts for only 
a little over 4% of the total variance and 7% 
of the common variance, is represented by 
only two scales: inferential-logical and intui- 
tive-rational, both being quite highly posi- 
tively loaded on this factor. The rotated load- 
ings for these scales are, respectively, .74 and 
.78, and they are fairly pure scales; but there 
are no other scales even approaching signifi- 
cant loadings on this factor. Consequently, 
it is rather difficult to interpret this factor 
with great conviction; for it is probably more 
a factor of Subjectivity/Objectivity rather 
than one of inference. The difficulty in in- 
terpreting this factor reflects in part a se- 
lection of scales of which the referents, retro- 
spectively considered, are somewhat nebulous. 
The fourth through sixth factors are all significant 
(p < .05) according to Humphreys’ 
Rule (Fruchter, 1954, pp. 79-80), the respective 
percentages of total variance accounted 
for being 3% for the fourth factor, 2% for the 
fifth, and 1% for the sixth factor. None of 
these factors, however, makes any interpretive 
sense. 
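
The reported percentages can be checked from the column sums printed beneath Table 3. The sketch below assumes that 13.38 and 1.66 are the sums of squared loadings for Factors 1 and 3, and that 22.19 is the total communality; the sum for Factor 2 is not legible in the table and is therefore left out. The variable and function names are hypothetical. 

```python
N_SCALES = 40                 # each standardized scale contributes unit variance
TOTAL_COMMUNALITY = 22.19     # assumed: sum of h^2 over the 40 scales

# Assumed sums of squared rotated loadings, read from the foot of Table 3.
SUM_SQUARED_LOADINGS = {"Factor 1": 13.38, "Factor 3": 1.66}


def pct_total_variance(sum_sq):
    """Percentage of total variance: sum of squared loadings / number of scales."""
    return 100.0 * sum_sq / N_SCALES


def pct_common_variance(sum_sq):
    """Percentage of common variance: sum of squared loadings / total communality."""
    return 100.0 * sum_sq / TOTAL_COMMUNALITY


for name, value in SUM_SQUARED_LOADINGS.items():
    print(name, round(pct_total_variance(value), 1), round(pct_common_variance(value), 1))
```

Under these assumptions Factor 1 comes to about 33% of the total and 60% of the common variance, and Factor 3 to a little over 4% and about 7%, which agrees with the figures given in the text. 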


Graphic Representation of Factors 1 and 2 


A number of scales with high loadings on 
the first two factors are plotted two-dimen- 
sionally in Figure 1, for illustrative purposes. 
The axes in the Figure are drawn at right 
angles since the Quartimax rotation leads to 
an orthogonal solution. The first dimension, 
that of Professional Evaluation, in a sense 
sets the image that the raters have of “good” 
professional behavior; that is, responses involving 
connotations of acceptance, sensitiveness, 
relaxedness, muteness, calmness, caution, 
and so on. The second dimension, Precision/Potency, 
is clearly more allied, as noted earlier, 
to the original concept of Activity, and 
its assumed attributes of Lead and Ambiguity. 


Construction of the Three-Dimensional Model 


A model representing the positions of the 
10 therapist responses in “semantic space” 
was constructed 3 from the Generalized Distance 
Formula (Osgood et al., 1957, pp. 90-97). 
For this purpose the raw scores on three 
pairs of scales were used to compute values 
of D. The three pairs of scales were selected 
on the basis of their having high, significant, 
relatively pure loadings, respectively, on 
Factors 1, 2, and 3. The scales were: foolish-wise 
and valuable-worthless on Factor 1, 
colorless-colorful and energetic-inert on Factor 2, 
and inferential-logical and intuitive-rational 
on Factor 3 (see Table 3). Following 
Osgood’s procedure, the model was represented 
by 10 rubber balls 1 inch in diameter, 
[illegible] in three-dimensional space with 
[illegible] rods of [illegible] inch gauge, cut to appropriate 
lengths. The model is presented in 
Figure 2. Slight but not at all serious difficulty 
was experienced in fitting all of the distances 
exactly, thus suggesting that three dimensions 
account quite well for the obtained data. 

Fig. 2. Three-dimensional model of the 10 therapist 
responses in “semantic” space. 

Dimensionality of Ratings 301 

Table 1 earlier presented the 10 verbal responses, 
and a rationale for their inclusion in 
[illegible] was given. Figure 2 illustrates that, by 
and large, a priori expectations are borne 
out, and the model thus has considerable face 
validity. Response 1 [illegible] 
“Hm-hm,” while Response 10 
[illegible] a rare attempt to persuade the 
patient. These two responses are the most 
distant from each other in the model. Responses 
2 and 3 are similar, low active facilitating 
responses, and they are appropriately 
close to each other in Figure 2. Responses 4 
and 5 refer to more sharply (but equally) 
focused facilitating responses, and they also 
[illegible] close to each other in the model. 
Response 6 refers to a highly specific, objective 
(presumedly uncharged) question; 
it is unique in the set of 10, and accordingly 
rather solitary. Responses 7 and 8, on the 
other hand, constitute interpretive operations 
of [illegible] moderate depth, and are adjacent 
in Figure 2. The ninth response refers 
to a supportive/reassurance operation, clearly 
discriminated in the three-dimensional model 
from other responses. The relative positions 
of the 10 balls in the model thus accord 
largely with purely subjective clinical “feel,” 
and agree very well with empirically observed 
groupings of the responses found in the earlier 
one-dimensional study. 

3 Thanks are acknowledged to Michael S. Black, 
University of Illinois, for painstakingly 
constructing this model. 
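
The distance computation behind the model can be sketched as follows. The Generalized Distance Formula takes D between two concepts as the square root of the summed squared differences between their scores on the chosen scales. How the raw scores were combined across judges is not stated here, so the sketch assumes one score per response per scale (for instance, a mean over judges); the function names and the profile values are hypothetical and are not the study's data. 

```python
import math


def generalized_distance(profile_a, profile_b):
    """D = sqrt(sum of squared score differences across the scales)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def distance_matrix(profiles):
    """All pairwise D values among the rated responses."""
    labels = sorted(profiles)
    return {
        (i, j): generalized_distance(profiles[i], profiles[j])
        for i in labels
        for j in labels
    }


# Illustrative scores on six scales (two per factor); invented values only.
profiles = {
    1: [6.0, 6.0, 2.0, 5.5, 3.0, 3.0],
    6: [5.0, 5.0, 6.0, 2.0, 5.5, 5.5],
    10: [2.0, 2.0, 6.5, 1.5, 4.0, 4.0],
}
distances = distance_matrix(profiles)
print(round(distances[(1, 10)], 2))
```

Each D then fixes the length of the rod joining two balls. A set of distances computed on three factors should fit in three-dimensional space with little strain, which is the sense of the remark about fitting the distances. 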


DISCUSSION 


The general findings are reminiscent of two 
earlier studies. One, published by Fisher 
(1956), concerned ratings of “plausibility” 
versus “depth” of interpretive responses. 
Fisher showed strong, significant relationships 
between ratings of depth of therapist inter- 
pretive operations, and ratings obtained when 
the working dimension was Plausibility. The 
findings of the present study independently 
suggest Fisher’s results to be extremely plau- 
sible! Presumably there is a (theoretically) 
infinite number of “one-dimensional” scale 
labels that would give approximately compa- 
rable results in any such rating study.* This 
consideration argues, of course, for the hy- 
pothesis that regardless of what instruction 
one gives to a rater, he will, in the final analy- 
sis, rate according to certain internal mediat- 
ing cues which only partly correspond with 
whatever explicit cues the experimenter is try- 
ing to communicate. Fisher has drawn atten- 
tion to the essence of the problem here in- 
volved. The present findings rather forcefully 
suggest that upon closer inspection, some of 
the dimensions of therapist verbal behavior 
frequently studied empirically may turn out, 
operationally speaking, to be one and the 
same. 

A second finding of which the present ones 
are reminiscent was published by Raush et al. 
(1956). While those authors concluded that 
on the whole they could not find evidence of 


4 As a matter of fact, a set of 35 abstract descrip- 
tions of therapist verbal responses were recently 
sorted here by completely naive, unsophisticated, 
freshman undergraduates, along “any increasing di- 
mension that you think would be appropriate.” The 
rank orderings of these therapist responses sorted by 
17 subjects showed a rho of .87 with ranked mean 
ratings (of the same 35 responses) made by 20 Board- 
certified psychiatrists given a working definition of 
Activity Level in terms of Ambiguity, Lead, and In- 
ference, and asked to sort on this dimension! These 
data are unpublished. They lead one to the opinion 
that there is a rather basic cultural uniformity in 
discriminatory reactivity to verbal statements, prob- 
ably because verbal statements both define and re- 
flect the fundaments of a relationship between two 
people. 
multidimensionality in ratings of their Depth 
of Interpretation data, one of their studies, 
nevertheless, did yield three-dimensionality. 
The first dimension was clearly one of depth; 
the second was called Ambiguity; while the 
third was not identified. The authors later con- 
cluded that the second dimension was specific 
to the raters and materials employed. While 
the adjectival scale deep-shallow used in the 
present study is in no real sense equivalent 
to depth as defined by Raush et al., it is none- 
theless interesting to note that such a scale 
is clearly more highly loaded on the Profes- 
sional Evaluation factor than on the Pre- 
cision/Potency factor (see Table 3, Line 37). 
Were there any relationship between the deep- 
shallow scale and the defined concept of 
depth, then such would support the hypothe- 
sis that under at least some conditions the 
depth dimension is perhaps one of evaluation, 
rather than one of ambiguity or precision. 
These similarities are, of course, only pe- 
ripheral. It is also to be noted that Osburn 
(1951) found no evidence of multidimension- 
ality among his ratings of Ambiguity. 

The results confirm the prior impression 
that raters do indeed tend to react to state- 
ments in the semantic differential rating situa- 
tion with an attitude which is primarily 
evaluative, and that only in the second place, 
as it were, do they concern themselves with 
the degree of ambiguity, clarity, activity, pre- 
cision, and focus of a response. In comple- 
mentary fashion, it is a comforting finding 
that judgments of the second kind are not, 
after all, necessarily tinged with evaluative 
considerations. Presumably in some experi- 
mental rating situations attributes of evalua- 
tion and precision may be confounded and 
thus heighten error, if the subject is forced 
arbitrarily to rate along an inadequately de- 
fined scale. But happily, the two influences in 
the present study appear to be quite inde- 
pendent. 

Finally, it should fairly be said that Os- 
good and his colleagues have clearly docu- 
mented the consistent, primary emergence of 
an evaluative factor in widely differing con- 
texts—even in the ratings of sonar signals. It 
is thus in one respect not at all surprising that 
a similar finding was made in the study re- 
ported here, and that a fusion of his activity 
and potency factors emerged second. It remains 
to be seen whether the 10 therapist responses 
themselves largely hold up to a one-, 
two-, or three-dimensional hypothesis, when 
rated by three independent groups of subjects 
upon each of the three dimensions uncovered. 
Such a study is planned. 


SUMMARY 


Thirty-five Board-certified psychiatrists 
rated 10 bona fide therapist verbal responses 
against 40 seven-point, bipolar adjectival 
scales, chosen to correspond with the hypothetical 
variables of Ambiguity, Lead, Inference, 
Stressfulness, Evaluation, Potency, and 
Activity. The matrix of intercorrelations 
among the 40 scales was analyzed by Thurstone’s 
complete centroid method, and the 
centroid was rotated via the quartimax 
method, maintaining orthogonality. The first 
factor, accounting for 33% of the total variance, 
was one of Professional Evaluation; the 
second, accounting for 18% of the total variance, 
was one of Precision/Potency. The third 
factor discussed (accounting for only 4%) 
was one of Subjectivity/Objectivity. The results 
thus indicate that while ratings were 
made primarily on an evaluation dimension 
and secondarily on a dimension of precision 
and potency (ambiguity), the two dimensions 
are independent. Results are discussed from 
the standpoints of their limitations and generalizability, 
the need for further experimentation, 
and the findings of other investigators. 
NEED VALUE AND EXPECTANCY INTERRELATIONS 
AS ASSESSED FROM MOTIVATIONAL PATTERNS 
OF PARENTS AND THEIR CHILDREN 1 

The nature of the relation between need 
value (goal value) and expectancy has been 
approached in two somewhat different ways. 
In more controlled laboratory studies (Cran- 
dall, Solomon, & Kellaway, 1955; Edwards, 
1955; Feather, 1959; Irwin, 1953; Marks, 
1951; Worell, 1956), the subject’s preference 
for known alternatives or the subject’s willing- 
ness to bet on an outcome with a controlled 
actuarial probability of occurrence has been 
used as a measure from which expectancy, 
need value, and the relation between them 
have been inferred. In more broadly conceived 
research such as that of Atkinson and his col- 
leagues (1958), story themes have been used 
to study strength of a single motive such as 
need for achievement. Results from both of 
these types of studies have demonstrated the 
importance of the two concepts of need value 
and expectancy in dealing with a variety of 
problems ranging from predictions of animal 
behavior in choice situations to analysis of 
thematic test responses in personality research 
(Feather, 1959). 

Among theoretical approaches to the study 
of need value and expectancy there have been 
differences as to the relation between these 
two variables. Rotter (1954) considers need 
value and expectancy to be independent, as 
does Edwards (1955) in his SEU model. In 
contrast, Atkinson (1958) and others (Feather, 
1959) consider them interrelated. Empirical 
evidence indicates that they are statistically 
interrelated under some conditions, but that 
the relation found varies for different probability 
levels (Crandall et al., 1955) and for 
different goal value levels (Worell, 1956). 
Further, any evident relation decreases when 
a “premium” (i.e., a high goal value) is 
placed on accuracy of estimates of occurrence 
(Crandall et al., 1955; Worell, 1956). Also, 
the relation may be less direct in adults than 
in children (Crandall et al., 1955; Irwin, 
1953; Marks, 1951). Finally, the nature of 
the relation may be different for achievement 
and nonachievement situations (Atkinson, 
1958; Feather, 1959; Marks, 1951; Worell, 
1956). 

1 This investigation was supported by a research 
grant, M-1137, from The National Institute of Mental 
Health, United States Public Health Service and 
by research funds from Southern Illinois University. 

The objective of the present paper is to present 
findings pertinent to three questions concerning 
need value-expectancy relations that 
are of theoretical as well as practical importance. 
These questions concern: the independence 
of need value and expectancy, the relation 
between need value and expectancy in 
parents in contrast to that relation in children, 
and the relation between need value 
and expectancy for recognition-status motives 
in contrast to other motives. 


METHODOLOGICAL PROCEDURES 
AND RATIONALE 

Subjects 
The subjects were three samples of children, ages 
2-6 to 5-0, enrolled in a cooperative preschool located 
on a college campus, and the parents of these 
children. The three samples included, respectively, 18, 
16, and 11 children; total Ns were 45 children (20 
boys, 25 girls), 45 fathers, and 45 mothers. Forty-three 
of the 45 sets of parents were college student 
or faculty families; the remaining 2 families were 
from the local community. 


Collection of Data 

Parent data were collected in a 100-question free-response 
interview structured to elicit information 
concerning parental child-rearing motivations. Each 
interview required approximately 2 hours to complete 
and was tape recorded. Interviews were conducted 
by two trained interviewers, each of whom 
interviewed half the fathers and half the mothers, 
but never both members of the same family. Child 
data were obtained by two trained observers who 
[illegible] narrative records of each child's behavior for 
[illegible] periods during regular preschool activities. 
Each child was observed approximately 300 minutes 
during the first 3 months of school in the fall, with 
observations for each child distributed equally over 
[illegible] period. The children were observed in 
a predetermined manner designed to prevent 
[illegible] biases. All parent interviews and child 
observations were typed and data analyses were made 
from typescripts. 
Rating Procedures 

The present study has involved categorizing and 
rating parent interview responses and child preschool 
behaviors on the same motivational variables, 
while at the same time avoiding contamination in the 
rating of parental and child protocols. Accordingly, 
two research teams operated independently—one 
working with the child data and one with the 
parent data—to develop explicit operational definitions 
of all concepts, and to construct scoring-by-example 
manuals for the parent data and for the 
child data using these definitions. 

The task of constructing parallel operational definitions 
of need value and expectancy for parents and 
for children’s behavior was accomplished as follows. 
All records were analyzed by referents, or behavioral 
units. Each unit contained information concerning: 
(a) the stimulus context at the moment (interview 
question for parents; nursery setting or specific behavioral 
stimuli for children); (b) the goal-direction 
of the response (for the parent this goal-direction 
was inferred from content of verbal statements; 
for the child it was inferred from the nature of his 
behavior, e.g., friendly behaviors are considered to 
be love and affection goal-directed); and (c) other 
characteristics of the response (e.g., for the parent, 
statement of persistence, statement of anticipation or 
dread as accompanying a behavior, etc.; for the child, 
persistence, constructive-nonconstructive or defensive 
nature, accompanying verbalization, etc.). 

The parallel categories used for classifying 
the data with regard to goal-directional characteristics 
were adapted from Rotter (1954). As utilized in 
the present study, they have been given the following 
operational definitions: 

Recognition-Status (R-S): Parents—concern with 
child’s achievements, teaching skills to child, being 
known as a good parent, doing what a parent should. 
Children—seeking attention to achievement, behavior, 
attributes, or possessions; conformity to or imitation 
of [illegible] or other socially approved behaviors; 
[illegible] setting activities. 

Love and Affection (L&A): Parents—concern with 
companionship with child, interest in child’s 
happiness. Children—cooperative play, friendly behaviors, 
sharing, helping, sympathetic behavior, seeking 
affection. 

Dominance (Dom): Parents—concern with controlling 
child, teaching obedience, molding child. Children—commanding 
others, making demands, aggression, 
controlling others’ activities. 

Protection-Dependency (P-D): Parents—(not relevant 
to child-rearing motivations of parents with preschool 
children). Children—seeking help, information, 
permission, comfort or consolation, and intervention 
of others to prevent frustration. 

Independence (Ind): Parents—(not relevant, since 
independence satisfactions by definition are self-mediated). 
Children—individualized activity, self care. 


Rationale 


The following general rationale was developed as a 
basis for differentiating expectancy and need value 
referents for each motivational category. 

Need value. In general, there is agreement among 
psychologists that there is a direct relationship be- 
tween the value of a goal or reinforcement and choice 
preference for that goal. Accordingly, as a basis for 
inferring need value (NV) ratings on the seven-point 
scale used, the following operations were set up. 

1. Strength of NV rating is determined by: 

a. Variations in stimulus cues as follows: For 
both parents and children the fewer the stimulus cues, 
or the less the “stimulus pull,” the higher the need 
value rating given. So if there are very few stimulus 
cues present—as when an interview question is di- 
rected toward eliciting a response in one need area 
and the response given is directed toward another 
need—the response is given a relatively high NV 
rating. 

b. Variations in response characteristics as fol- 
lows: For children, persistence of response leads to a 
higher rating. For parents, statements concerning per- 
sistence, or statements concerning instigation of the 
indicated kind of activity in situations with few cues 
present lead to a higher rating. 

2. Need category to be scored for NV is indicated 
as follows: 

a. Goal direction 2 of response, either interview 
statement or observed behavior, is scored. 

b. Goal direction indicated by the situational 
structure is scored if the situation is “maximized” for 
response in that direction. This criterion leads to 
scoring for nonoccurrence of a response when a maxi- 
mally structured situation is rejected completely by 
the person responding; this rejection is considered 
evidence of very low NV in that area. The response 
direction is also scored as indicated earlier and given 
a high NV rating if it occurred when there were very 
few stimulus cues present to elicit it. 

This procedure of scoring NV in an inverse relation 
to the strength of eliciting stimuli is roughly 
comparable to procedures followed in projective testing 
in which attempts have been made systematically 
to reduce clarity of stimulus cues as a basis for getting 
at important personality dimensions, and in 
which unusual or atypical responses are considered 
most significant (Lindzey, 1952). Scoring nonoccurrence 
of responses for low NV under specified conditions 
is less generally accepted, but seemed justifiable 
and consistent with the broader rationale on which 
NV strength was determined. The behavioral variable 
used, persistence, is traditionally considered an 
indication of the strength of motivation to achieve 
a goal, and consequently of the value of the goal to 
the person behaving. 

2 For a more detailed account of determination of 
goal-direction of responses, see Tyler (1960). 

Expectancy. There is considerably less agreement 
among psychologists as to the relationship between 
expectancy (Ex) or subjective probability and choice 
preference for a goal. One point of view is that there 
is a direct relation between these two (Atkinson, 
1958). A somewhat conflicting view is that the effect 
of low Ex of goal attainment is to lead to the oc- 
currence of “defensive” behaviors with regard to im- 
portant goals (Eriksen, 1950; Rotter, 1954). Much 
attention has been given in personality research to 
study of defenses, even to the extent at times of 
assuming that nonoccurrence of pertinent behavior 
must be defensiveness (Zuk, 1956). This extreme po- 
sition is not taken here. The position is taken that 
defensive behaviors are not categorically different 
from other behaviors; rather they indicate the low 
end of a continuum of constructiveness of goal di- 
rected activity. That is, the constructiveness of be- 
haviors is considered to reflect directly the subject's 
level of subjective probability. Consequently the fol- 
lowing measures of expectancy were set up. 

1. Need category to be scored for Ex is indicated 
by goal direction of response. 

2. Strength of Ex rating is indicated by: 

a. Variations in response characteristics as fol- 
lows: For children, ratings range from lowest for 
withdrawal behaviors, clearly defensive behaviors 
(e.g., disruptive or nonsocially approved behaviors), 
and tentative abortive behavior attempts; up the 
scale to highest ratings for direct, socially approved 
interactions which include indicators of high con- 
fidence (e.g., direct smiling approaches to new chil- 
dren, etc.). For parents, ratings range from lowest 
for statements of dismay, loss of self-control, and 
failure and frustration over attempts to attain goals; 
up the scale to highest ratings for statements of an- 
ticipation of interacting with children, joy at inter- 
actions together, etc. 

b. Variations in stimulus cues also affect Ex as 
follows: For children, since data used were records of 
direct behavior in relatively free situations (i.e., no 
“forced choices” of a controlled sort existed as they 
might in experimental situations), it was thought that 
the levels of expectancy which were quite high might 
not be differentiable except in situations where a 
few stimulus cues were present. Consequently it was 
decided to give the highest ratings only to high con- 
fidence responses which occurred in situations where 
there were a minimum of relevant stimulus cues. 
Although this method of scoring child Ex makes possible 
a slight overlap between child NV and Ex 
scores, it was deemed essential for the reasons indicated. 
For parents it was possible to structure questions 
to elicit expectancy statements independent of 
need value, so no comparable variations in scoring 
were required for parental expectancy measures. 
Correlations reported are Pearson product-moment 
r’s (McNemar, 1955) based on mean scores computed 
from all the ratings given to an individual. 
Differences between correlations are assessed using 
Fisher’s z transformation (McNemar, 1955). 
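
A comparison of two correlations by Fisher’s z transformation can be sketched as follows. The sketch assumes that the two correlations come from independent samples, which may not hold for every comparison made in this study; the function names and the example values are hypothetical. 

```python
import math


def fisher_z(r):
    """Fisher's transformation: z = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)


def compare_correlations(r1, n1, r2, n2):
    """Normal-deviate test for the difference between two independent r's.

    Returns the critical ratio and its two-tailed probability.
    """
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    ratio = (fisher_z(r1) - fisher_z(r2)) / standard_error
    p_two_tailed = math.erfc(abs(ratio) / math.sqrt(2.0))
    return ratio, p_two_tailed


# Illustrative values only: two correlations, each based on 45 cases.
ratio, p = compare_correlations(0.40, 45, -0.30, 45)
print(round(ratio, 2), p < 0.05)
```

The transformation is needed because the sampling distribution of r is skewed when the population correlation is not zero, whereas z is approximately normal with standard error 1/sqrt(N - 3). 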


FINDINGS 


This study has been concerned with chart- 
ing relationships among motivational charac- 
teristics in parents and in children. For that 
reason findings obtained are meaningful for 
testing hypotheses primarily in a construct 
validity fashion (Cronbach & Meehl, 1955). 
Correlations reported are examined to deter- 
mine whether they are consistent with rela- 
tionships hypothesized to exist from specified 
theoretical points of view. Their support for 
any such aspects of a “nomological net” con- 
stitutes construct validation. 

Interrater reliabilities for the parent and 
child scoring manuals are reported in detail 
elsewhere (Rafferty, Tyler, & Tyler, 1960; 
Tyler, Tyler, & Rafferty, 1959). Those figures 
can be summarized by noting that median interrater 
reliabilities for these manuals range 
from .68 to .84 for samples of 18 and 16 subjects. 

Correlations reported will be for the total 
45 subjects from all three samples studied. 
The indicated analyses have been run for the 
separate samples, but the variability in findings 
between samples is within the limits expected 
by chance. Hence, they are not reported 
here. 

The need categories used in this study are 
Recognition-Status (R-S), Love & Affection 
(L&A), Dominance (Dom), Protection-Dependency 
(P-D), and Independence (Ind). 
For each category, both Need Value (NV) 
and Expectancy (Ex) scores have been obtained. 

It was necessary to assess the independence
of the need categories to determine whether
separate NV-Ex interrelationship analyses
were justifiable. The pertinent interneed cor-
relations among NV measures can be sum-
marized briefly. For fathers, mothers, and
children the R-S and L&A NVs are clearly
independent, and there is a moderate positive
relationship (r = .40 +) between R-S and
Dom NVs in all three sets of measures. A
somewhat smaller but statistically significant
inverse relationship (r = -.30 +) exists be-
tween L&A and Dom NV measures for par-
ents. For girls, this L&A-Dom NV relation-
ship also tends to be negative, though for
boys it is positive. However, neither of these
relations approaches statistical significance.
[Illegible] interneed comparisons do yield mar-
ginally significant (.05 level) r’s between R-S
and [illegible] (r = [sign illegible] .34) and
between L&A and Ind (r = [sign illegible] .51).
Nevertheless, 7 of the 10 interneed correla-
tions for all children are nonsignificant; 8 of
the 10 for girls alone are nonsignificant; and
all 10 are nonsignificant for boys. In general
it seems appropriate to conclude that these
correlations indicate sufficient [illegible]
independence among these NV measures to
warrant consideration of them as function-
ally distinct motivational categories.

When the interneed correlations among expect-
ancy measures are examined more indication
of a consistent pattern of positive interrela-
tionships is found than was the case among
Need Value measures. However, this pattern
seems to be primarily a function of a gen-
eralized level of Ex among the R-S, L&A, and
Dom need categories which holds for mothers
and girls. This generalized relationship is
moderate (r [illegible]) for mothers, and
moderate to high (r’s range from .55 to .75)
for girls. In contrast, for fathers the only
interneed Ex relationship is a moderate one
(r = .45, significant at [illegible] level)
between L&A and Dom,
while for sons these three Ex measures are not
significantly interrelated. The nature of the 
sex differential for this generalized Ex pat- 
tern is indicated even more clearly by the 
fact that boy-girl differences are significant 
(two at the .10 level, one L&A-Dom at the 
.05) for these three comparisons, and father- 
girl differences are significant on two of them 
(R-S-L&A r’s at .05 level, L&A-Dom r’s at
.01 level). The differences between mothers’
interneed Ex correlations and those of fathers
and boys also tend to indicate a more gen-
eralized Ex for mothers, but none of the cor-
relational differences is significant between
mothers and sons, and only one (L&A-Dom,
D significant at .05) is significant between
mothers and fathers. Even though this some-
what generalized Ex for females is noted, it
should also be pointed out that 6 of the 10
interneed expectancy r’s for girls are not sig-
nificant; nor are the 3 Ex measures for
mothers so interrelated as to warrant
considering them as 3 measures of the same
variable. For fathers, only 1 of the 3 com-
parisons yields even a moderate r, and for
boys only 1 of 10 (Dom-Ind r = .46, signifi-
cant at .05 level) achieves significance. Thus
there seems to be sufficient independence
among these expectancy variables to justify
consideration of them as operationally inde-
pendent.

Results of analyses of NV—Ex interrelation- 
ships are reported in Table 1. Findings perti- 
nent to answering the first general theoretical 
question of the independence of NV and Ex 
can be summarized as follows: 

1. Fathers’ NV and Ex scores show a slight 


TABLE 1

CORRELATIONS BETWEEN NEED VALUE AND EXPECTANCY MEASURES

                             Parents              Children
                        Fathers   Mothers    Boys      Girls     Total
Need Category           (N = 45)  (N = 45)  (N = 20)  (N = 25)  (N = 45)

Recognition-Status        .19       .00     [illeg.]    .60     [illeg.]
Love and Affection        .27       .41     [illeg.]  [illeg.]  [illeg.]
Dominance                 .21       .07     [illeg.]  [illeg.]  [illeg.]
Protection-Dependency      -         -      [illeg.]  [illeg.]  [illeg.]
Independence               -         -      [illeg.]  [illeg.]  [illeg.]

Note.—Level of significance: [illegible] .40 .51 .56


TABLE 2

DIFFERENCES BETWEEN NEED VALUE-EXPECTANCY INTERCORRELATIONS
AMONG SUBJECT GROUPS

Need Category           Fa-Mo  Fa-Child  Fa-Boy  Fa-Girl  Mo-Child  Mo-Boy  Mo-Girl  Boy-Girl

Recognition-Status
  NV-Expectancy         .192    .232      .042    .501     .424      .234    .693**   .459
Love and Affection
  NV-Expectancy         .159    .277      .053    .408     .436*     .212    .567*   [illeg.]
Dominance
  NV-Expectancy         .144    .782**    .673*   .857**   .926**    .817** 1.001**  [illeg.]
Protection-Dependency
  NV-Expectancy          -       -         -       -        -         -       -       .068
Independence
  NV-Expectancy          -       -         -       -        -         -       -       .432

Note.—Fathers (N = 45), Mothers (N = 45), Boys (N = 20), and Girls (N = 25).
* Significant at or beyond the .05 level.
** Significant at or beyond the .01 level.


positive interrelation of the order of .20+,
but they are not sufficiently interrelated for 
that fact to be significant statistically. 

2. For mothers, any NV-Ex relation is con-
fined to a moderate one (r = .41, significant
at .01 level) within the L&A need category.

3. For children, there is a high positive in- 
tercorrelation in three need categories, Dom, 
P-D, and Ind. There is also a high NV-Ex r 
(.60, significant at .01 level) for girls on the 
R-S variable. However, NV and Ex measures 
for children on L&A are completely independ- 
ent. It would seem that the NV-Ex interrela- 
tions obtained are relatively specific, and are 
a function of at least the following factors: 
(a) age of subject, e.g., parents show a closer 
NV-Ex relation on L&A than do children; 
(b) sex of subject, e.g., boys and girls differ 
in degree of NV-Ex relationship on R-S; and 
(c) motivational category under considera- 
tion, e.g., boys and girls have a quite different 
NV-Ex relationship on L&A than on Dom. 

The second question of whether NV and Ex 
are less independent in children than in adults 
can be answered by reference to Tables 1 and 
2. For girls, it can be seen that their need 
value—expectancy scores are significantly more 
closely intercorrelated for R-S motives (girl- 
mother difference significant at .01, girl-father 
at .06) and for Dom motives (both compari- 
sons significant at .01 level) than is the case 


for their parents. However, the converse is
true for L&A motives since the NV-Ex r for
mothers is significantly greater (.05 level)
than that for daughters, and the comparable
r for fathers is greater than that for daugh-
ters, though this latter difference does not
reach an acceptable level of statistical sig-
nificance. For boys, a similar pattern exists
though the differences are not as marked. The
greater NV-Ex interrelation on R-S and Dom
holds for boys in relation to parents, but is
statistically significant only for the latter.
Also, the greater NV-Ex interrelation for par-
ents on the L&A category holds with respect
to boys and parents, but it does not achieve
significance.

These comparisons of parental and child
NV-Ex relations do not indicate consistently
closer NV-Ex relations for children; rather,
they seem to indicate that whether parents or
children will be found to have a closer NV-Ex
relation is a function of the motivational cate-
gory under consideration.

The third question asked is that of the re-
lation between NV and Ex for R-S motives
in contrast to other motives. The hypothesis
for which this comparison is pertinent is one
advanced by Atkinson (1958) which states
that there is an inverse relationship between
incentive value and subjective probability for
need Achievement which does not maintain
for other needs. Data to test this hypothesis
are the relationships presented in Table 1,
with the pertinent comparisons being those
between rows for each subject group. For fa-
thers there are no differences from need to
need in the NV-Ex relation. For mothers, the
R-S and Dom NV-Ex relations are not dif-
ferent, but the comparable L&A relation
is significantly greater than comparable com-
parisons on the R-S variable (D = .436, sig-
nificant at .05 level), and the Dom variable
(D = [illegible], significant at .10 level). For
children, the NV-Ex relation on R-S is inter-
mediate in relative as well as absolute magni-
tude. It is greater (D = .424, significant at .06
level) than the comparable relation for L&A
and less than that on the other three need
variables (P-D D = .334, nonsignificant; Ind
D = [illegible], significant at .05 level; Dom D
= [illegible], significant at .01 level). The find-
ings are not in accordance with the hypothesis
advanced by Atkinson since for none of the
subject groups is there clearly a less direct or
inverse relationship between NV and Ex
for R-S motives than for all other motives.


DISCUSSION


The three theoretical questions concern-
ing need value-expectancy interrelationships
which are the primary focus in this article
are: (a) the independence of need value
and expectancy, (b) the greater independence
of need values and expectancies in adults than
in children, and (c) the less direct relation-
ship between need value and expectancy vari-
ables for recognition-status motives than for
other motivational categories. It should be
noted that previous data from which answers
to these questions have been inferred have
[illegible] been derived in controlled experi-
mental situations where known expectancies
and [illegible] preferences could be systemat-
ically manipulated. In contrast, the findings of
this study are based on more molar data with
scores for each subject derived from
a [illegible] number of referents (answers to
interview questions, or observed behaviors).
While it is not appropriate to assess from
the present findings the validity of results ob-
tained in specific experimental settings, it is
possible to determine from these results the
representativeness of those more molecular
situations and the generality of conclusions
derived from them.

The question of a general need value-expec- 
tancy relation which holds for all needs and 
all subjects has been raised by Rotter (1954), 
Atkinson (1958), Edwards (1955), and others 
(Feather, 1959). Neither a general position 
of NV-Ex independence (Edwards, 1955; 
Rotter, 1954) nor a position of interdepend- 
ence (Atkinson, 1958) can be supported by 
the findings of this study. Rather, it would 
seem that need value-expectancy interrela- 
tions are specific to the sex and age of the 
subjects studied, and to the motivational cate- 
gory measured. Although fathers yield a 
stable nonsignificant NV-—Ex relationship for 
all need comparisons, none of the other sub- 
ject groups do. Further, the NV-Ex relation- 
ship for parents is different from that for 
children on both L&A and Dom, and this in- 
terrelation for girls on R-S is significantly dif- 
ferent from that for boys, for fathers, and for 
mothers on the R-S variable. In general, these 
parent-child differences in need value—expec- 
tancy relationship such as that obtained on 
the L&A variable support the conclusion that 
adult motivational patterns are not clearly 
established during the preschool years. Fur-
ther, the obtained sex differences, such as that 
between boys and girls on R-S, provide sub- 
stantial evidence of sex-linking in patterns of 
motivational development. The specificity of 
relations found leads to underscoring the need 
for caution in generalizing from the results of 
one study to motivational characteristics of 
the population at large. 

The second question which concerns a closer 
relationship hypothesized between need values 
and expectancies in children than in parents 
is derived from a suggested explanation for 
discrepancies in results obtained by Crandall 
et al. (1955) and Irwin (1953) in contrast to 
those of Marks (1951). Crandall has indi- 
cated that the smaller effect of reinforcement 
values on expectancy statements found in his 
work and that of Irwin, in comparison to 
Marks’ findings, may be a function of greater
susceptibility to reinforcement value effects in
younger children, since Marks used 9 to 11
year olds as subjects, and Crandall and Irwin
used college students. If such a phenomenon
does exist, then preschool children’s NV and
Ex measures should be more closely related
than is the case for adults (i.e., their parents). 
The results obtained suggest that this ex- 
planation is an oversimplification of the rele- 
vant variables. Specifically, on the Dom vari- 
able the children show a greater NV-Ex in- 
terrelationship than the parents do, on the 
L&A variable the parents show a greater re- 
lationship than the children, and on the R-S 
variable the boys show a similar NV—Ex re- 
lationship to that of their parents, although 
the girls show a closer relationship. 

The third theoretical question concerns the 
hypothesis advanced by Atkinson (1958) 
that there is an inverse relationship between 
incentive value and subjective probability for 
n Ach which does not maintain for other 
needs. As noted earlier there is not complete 
overlap between the R-S motive category and 
the n Ach category, but R-S would seem to 
be subsumed within the latter category. Con- 
sequently, findings concerning it should be 
pertinent. Findings from the present study 
show that the indicated R-S NV-—Ex correla- 
tion is not significantly closer to an inverse 
relationship for R-S than for all other needs 
for any of the four groups (fathers, mothers, 
boys, girls) analyzed. In fact, for each group 
of subjects there is at least one other variable 
on which the NV-Ex relationship is no differ- 
ent from or less direct than the R-S NV-Ex 
relationship. Therefore, although these find- 
ings support an assumption of specificity of 
NV-Ex interrelationships to age and sex of 
subjects, and to motivational category under 
study, they are not consistent with Atkinson’s 
hypothesis. It may be that this discrepancy 
is a function of the lack of comparability be- 
tween the more molar level of measurement in 
the present study and the more specific and 
controlled measurement situations in which 
his work has been conducted. This discrep- 
ancy may also suggest that the inverse rela- 
tion between subjective probability and in- 
centive value which seems to hold in his ex- 
perimental situations may not be an important 
determinant of behavioral choices of the sort 
on which the findings of the present study are 
based. However, the obtained intersubject dif- 
ferences on the R-S variable warrant particu- 
lar consideration, since many investigations 
of motivation have focussed on achievement-
type situations and tasks. Although the recog-
nition-status motive includes only other-medi- 
ated achievement goals, so it is not completely 
equivalent to n Ach, it does fall within the 
achievement category. To the extent that 
these concepts overlap, the present findings 
indicate clearly that for preschool girls im- 
portance of achievement goals and subjective 
probability of attaining those goals are much 
more closely linked than is the case for pre-
school boys or for parents. Since this differ-
ential does not hold for the mothers’ child- 
rearing motivations, it would seem that girls 
may acquire this distinction at a later date 
than is the case for boys. Nevertheless, these 
findings tend to confirm the results of Veroff
(1953), which indicate that there are signifi-
cant sex differences in response to achieve-
ment situations.


SUMMARY 


In the present paper relations among the
need values (NVs) and expectancies (Ex’s)
of child-rearing motivations of parents and
of motivations of their preschool children are
analyzed. Subjects were 45 families with chil-
dren enrolled in a cooperative preschool in a
university housing project. Parental motives
for recognition-status (R-S), love and affec-
tion (L&A), and dominance (Dom) were as-
sessed from free-response interviews; child
motivations for the same three motives, and
for protection-dependency (P-D), and inde-
pendence (Ind) were assessed from narrative
observations of preschool activities. Adequate
interrater reliabilities for both parent and
child score-by-example manuals for rating
indicated protocols are reported.

Patterns of intercorrelation among the mo-
tivational variables measured support the con-
clusion that they are operationally inde-
pendent. Need value-expectancy correlations
obtained are viewed as evidence in a construct
validation sense of the validity of certain hy-
potheses which have been advanced by [illegible]
concerning theoretical interrelations between
need value and expectancies. On the basis of
these findings, there is no support for the hy-
pothesis that there is either a general pattern
of independence or a general pattern of inter-
relation between NVs and Ex’s which holds
for all subjects and all motivational cate-
gories. Rather, there is support for the posi-
tion that NV and Ex interrelations are spe-
cific to the need category under study, and to
age and sex of subjects. The hypothesis that
NVs and Ex’s are less independent in children
than in adults is likewise not supported gen-
erally, since this interrelation also seems to be
a function of the motivational category un-
der study, with parents differentiating more
on at least one of the three common motiva-
tional categories, and children differentiating
more on one. The hypothesis that there
would be an inverse relation between NVs
and Ex’s for R-S motives which does not
maintain for other motives could not be sup-
ported. Although there were clear NV-Ex cor-
relation differences from need category to need
category, for none of the subject groups was
the R-S NV-Ex r clearly less direct than was
the case for the other motives.


CROSS-VALIDATION OF A RORSCHACH CHECKLIST
ASSOCIATED WITH SUICIDAL TENDENCIES 


IRVING B. WEINER 1


University of Rochester School of Medicine and Dentistry 


Although numerous methods for assessing 
suicidal potential by means of the Rorschach 
test have appeared in the literature, failure 
to cross-validate “signs” and “configurations” 
supposedly indicative of suicidal tendencies 
has cast doubt on the utility of these meth- 
ods (Fisher, 1951; Sakheim, 1954). However, 
recent work by Daston and Sakheim (1960) 
using Martin’s checklist, 17 signs empirically 
derived from comparisons between suicidal 
and nonsuicidal psychiatric patients, has 
yielded promising results. Daston and Sak- 
heim compared the Rorschach protocols of 
patients who were nonsuicidal with those of 
patients who had attempted suicide or had 
actually taken their own lives. In their study 
there were virtually no differences between 
completed suicide and suicide attempt groups 
on Martin’s checklist, but both of these groups 
received significantly more of Martin’s signs 
than the nonsuicidal group. Furthermore, 83% 
of the successful suicides and 72% of the at- 
tempted suicides received six or more of Mar- 
tin’s signs, while only 17% of the controls dis- 
played this many signs. 

Since the above results were derived from a 
population of hospitalized male veterans un- 
differentiated with respect to psychiatric diag- 
nosis, question may be raised concerning the 
concurrent validity of Martin’s checklist for 
a population less homogeneous than the vet- 
eran group. This question in turn points to 
the importance of evaluating the influence of 
such variables as age, sex, hospitalization, and 
nature of psychopathology on the checklist 
scores. Additionally, since most Rorschach 
scores are significantly related to the total 
number of responses given (Fiske & Baugh-
man, 1953), it is pertinent to examine the re-
lationship between response total and check-
list score.


1 Appreciation is expressed to Norman I. Harway
for suggestions concerning the manuscript.


PROCEDURE 


Two samples were used in this study. The first,
which was intended to provide general information
about Martin’s signs, consisted of all adult patients
in a 6-month period from the psychiatric services of
a general hospital for whom scorable Rorschach pro-
tocols were available. Patients whose primary diag-
nosis was organic rather than functional in nature
were excluded. The sample contained 28 males and
43 females, had a median age of 29.5 years (range
15-55), and was comprised of 42 hospital and 29
clinic patients. The total group is categorized by age,
sex, and patient status in Table 1.


TABLE 1


AGE, SEX, AND PATIENT STATUS OF SUBJECTS


           Number Males        Number Females

Age      Hospital   Clinic    Hospital   Clinic    Total
15-21        1         4          6         5        16
21-30        4         5         11         4        24
31-40        6         4          7         4        21
41-55        3         1          4         2        10

Total       14        14         28        15        71


The Rorschach records of these 71 patients, which
ranged in number of responses from 7 to 67 with a
median of 20.33, were scored for Martin’s signs,
which are the following: No. D < 6 or > 20; [illegible]
< 60 or > 79; No. CF > 0 to < 3; Total Color [illegible]
< 1; Sum C > 1.0 to < 3.5; C or CF appear first on
VIII-X; C or CF with Sum Y + Sum T < 1; No.
FV + VF < 1; Sum Y < 1; Sum Y + Sum [illegible];
difference between M and Sum C < 1.5; No. [illegible]
> 6; No. Categories < 6 or > 13; VIII-X/R > 29%;
No. P < 3 or > 6; P < 3 with F+ % > 60; and
time first R < 27 seconds.

Subsequently the subjects’ hospital and clinic rec-
ords were utilized to assign the subjects to one of
three diagnostic categories: neurosis, which included
cases of conversion hysteria, obsessive-compulsive
neurosis, anxiety reaction, and neurotic depressive
reaction; character disorder, which applied to in-
stances of personality trait and personality pattern
disturbance; and psychosis, which included schizo-
phrenic, psychotic depressive, and involutional psy-
chotic reactions. The diagnostic criterion was the
label [illegible] the patient during that psychiatric
[illegible] in which the Rorschach had been given. For
hospital patients these labels were the discharge diag-
noses; for clinic patients, the recorded consensus of
[illegible] committee was used. The hospital group
contained 13 neurotics, 14 character disorders, and
15 psychotics; for the clinic group, the respective
numbers were 13, 10, and 6.

Examination of the records revealed that
eight of the subjects had made documented, serious
suicide attempts which had precipitated their admis-
sion to the hospital. A search of case files for the
year previous to the 6-month period of the study
yielded an additional sample of 16 patients who had
received psychological testing during a hospital ad-
mission subsequent to an attempt at suicide. Since
this second group had by and large been referred for
testing from the same wards by the same personnel
and had been evaluated by the same psychologists
as the original eight suicide attempts, it was felt ap-
propriate to combine these two groups for purpose
of comparison with the 63 nonsuicidal patients origi-
nally studied. The suicidal group was composed of 9
males and 15 females, had a median age of 30 years
(range [illegible]), and had given between 7 and 57 Ror-
schach responses with a median response total of 18.
Seven of these patients had been diagnosed neurotic,
nine as character disorders, and eight as psychotic.
It may be noted that this rather even nosological
distribution of suicidal patients is consistent with the
findings of Pokorny (1960). The observed similarity
between completed suicide and suicide attempt groups
(Daston & Sakheim, 1960) appears to justify non-
inclusion of actual suicides in this study. The signifi-
cance of suicide attempts as a precursor of actual
suicide, discussed by Shneidman and Farberow
(1957, pp. [illegible]) and Robins, Gasner, Kayes, Wilkin-
son, and Murphy (1959), further supports this de-
cision.


RESULTS


The original 71 subjects received between 4
and [illegible] of Martin’s signs, with a median of
[illegible] signs. As the distribution of numbers of
signs did not approach normality (Rorschach
scores in general are not normally distributed
[Fiske & Baughman, 1953]), and as the na-
ture of the sampling did not provide equal
numbers of subjects for various subgroupings,
the data were treated with nonparametric
techniques. The distribution of number of
signs was divided into five approxi-
mately equal parts for comparison with the
variables of age, sex, patient status, and num-
ber of Rorschach responses by means of 5 X n
contingency tables. The distributions of age
and number of responses were each divided
into four equal parts; for the dimension of
patient status, hospital (inpatient) and clinic
(outpatient) groups were contrasted. None of
the obtained χ² values for the association be-
tween number of Martin’s signs and these four
variables approached significance.


TABLE 2

ASSOCIATION BETWEEN MARTIN’S SUICIDE SIGNS,
SUICIDE CLASSIFICATION, AND DIAGNOSTIC
CATEGORY

                                 Diagnostic Category

                                         Character
Suicide Classification        Neurosis   Disorder   Psychosis

Suicide attempt:
  Number above Mdn a              6          7          6
  Number below Mdn b              1          2          2
Nonsuicidal:
  Number above Mdn a              5         10         11
  Number below Mdn b             20          9          8

a Eight or more signs.
b Fewer than eight signs.


In Table 2 are listed the frequencies of 
suicidal and nonsuicidal patients with differ- 
ent diagnoses who received more or less than 
the total group median number of Martin’s 
signs. Analysis of these data by Wilson’s 
(1956) method for nonparametric analysis of 
variance revealed the following: (a) number 
of signs and suicide classification were signifi- 
cantly associated beyond the .005 level of 
confidence; (b) the association between num- 
ber of signs and diagnostic category was sig- 
nificant beyond the .05 level; and (c) there 
was no significant interaction between suicidal 
classification and diagnostic category. 

The median numbers of signs received by 
the two groups were 8.83 for the suicide at- 
tempts and 7.04 for the controls. A median 
test indicated that the discrepancy between 
these values was significant at the .01 level 
of confidence. No clear cutting score separat- 
ing suicidal from nonsuicidal patients emerged 
from the data; the most efficient cutting score, 
an incidence of eight or more signs, correctly 
classified 79% of the suicide attempts and 
60% of the controls. 
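
The median test reported above amounts to a 2 x 2 comparison of the two groups on counts at or above versus below the cutting score. A minimal sketch follows, under two assumptions that should be checked against a clean copy of Table 2: the cell counts are those read from the table and summed over diagnostic categories (19 and 5 for suicide attempts, 26 and 37 for controls), and no continuity correction is applied, since the text does not say whether one was used. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the
    2 x 2 table [[a, b], [c, d]]; returns (chi2, p)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # Upper-tail probability of a chi-square variate with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Assumed counts (Table 2 summed over diagnostic categories):
# suicide attempts: 19 with eight or more signs, 5 with fewer
# controls:         26 with eight or more signs, 37 with fewer
chi2, p = chi_square_2x2(19, 5, 26, 37)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

Under these assumptions the statistic falls beyond the .01 level, which agrees in direction with the result stated in the text.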

The 17 signs were examined individually for
their capacity to differentiate between the sui-
cidal and nonsuicidal groups. A series of χ²
tests revealed that two signs (C or CF appear
first on VIII-X and P < 3 with F + % > 60) 
had been received by significantly (p < .05) 
more of the suicidal than the nonsuicidal 
group; trends (p < .10) in this direction oc- 
curred with two signs (No. CF >0 to <3 
and C or CF with Sum Y + Sum T < 1); one 
sign (VIII-X/R > 29%) significantly differ- 
entiated the two groups in the direction op- 
posite from expectation; and the other signs 
did not significantly discriminate suicidal from 
nonsuicidal patients. 


DISCUSSION 


The data indicate that, within the ranges 
sampled by this study, scores on Martin’s 
checklist may be interpreted similarly for 
men and women patients of different ages, 
whether they are hospitalized or applying for 
outpatient psychiatric care, and regardless of 
how many responses they give to the Ror- 
schach. Number of Martin’s signs is signifi- 
cantly and positively associated both with 
likelihood of having made a suicide attempt 
and with severity of psychopathology, if the 
labels neurosis, character disorder, and psy- 
chosis may be taken as a continuum of in- 
creasing emotional disturbance. The signifi- 
cance of both main effects, in the absence of 
any significant interaction, indicates that al- 
though the potential of the checklist to assess 
severity of pathology may be a topic for fur- 
ther investigation, the capacity of the measure 
to discriminate suicidal from nonsuicidal pa- 
tients operates independently of diagnostic 
category. 

The operation of the individual signs merits 
consideration, although the positive findings 
do not add much to current conceptions of 
what kinds of people attempt suicide. The 
best discriminator between the suicide at- 
tempts and the controls, P < 3 with F + % 
> 60, may be interpreted as a rejection of 
conventional behavior patterns in the pres- 
ence of adequate reality testing. However, 
this sign, though of predictive value, is too 
rare even in suicidal patients to have general 
descriptive value. The other significant dis- 
criminator and the two signs which approach 
significance (see Results) may reflect diffi- 
culty in dealing comfortably with affective 
stimulation and controlling tendencies toward
impulsive behavior. But such speculations do
not provide hypotheses about suicidal indi- 
viduals which have not previously been ad- 
vanced without recourse to projective test 
data, and the value of the checklist would 
seem to lie more in practical application than 
in contribution to theory. Concerning the sev- 
eral signs which contributed little or nothing 
to the differentiation between suicidal and 
nonsuicidal groups, the moral may be iterated 
that empirically derived items require care-
ful and repeated cross-validation before they
achieve status as efficient and reliable pre-
dictors.

With regard to future use of Martin’s check- 
list, a final point should be mentioned. Rosen
(1954) and Meehl and Rosen (1955) have
questioned the efficiency of psychometric in-
struments for the prediction of rare events
such as suicide and argue that the inevitable
number of false positives in such endeavors
renders even highly valid discriminators im-
practicable. However, Cureton (1957) has
discussed a method of allowing for popula-
tion base rates in the establishment of cut-
ting scores to predict a dichotomous criterion
from a continuous predictor. The necessity is
therefore for collection of sufficient normative
data to establish the distribution of the pre-
dictor, i.e., Martin’s checklist, in order that
through consideration of suicide rates, efficient
prediction of suicidal risks may be imple-
mented. The data of the current study would
suggest that the checklist has adequate con-
current validity to justify such future re-
search.
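
The base rate argument referred to above can be made concrete with a short calculation. The sketch below is not Cureton's method; it only applies Bayes' rule to the two hit rates reported for the cutting score of eight signs (.79 for suicide attempts, .60 for controls). The base rates are hypothetical values chosen for illustration, and the function name is likewise an assumption.

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """Proportion of cases at or above the cutting score that are
    true positives, for a given population base rate."""
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)


# .79 and .60 are the classification rates reported in the text;
# the base rates below are hypothetical, for illustration only.
for base_rate in (0.25, 0.05, 0.01):
    ppv = positive_predictive_value(0.79, 0.60, base_rate)
    print(f"base rate {base_rate:.2f}: proportion correct among "
          f"those flagged = {ppv:.3f}")
```

As the assumed base rate falls, most of the patients flagged by the cutting score are false positives, which is why normative data on the distribution of the predictor would be needed before cutting scores could be set efficiently.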


SUMMARY 


Rorschach protocols of 71 patients who had
been tested during a 6-month period were
scored for Martin’s suicide signs. The num-
ber of signs earned was found to be relatively
independent of the age, sex, hospital status,
and Rorschach response total of the subjects.
Subsequently, the records of 63 of these pa-
tients who were nonsuicidal were compared
with those of 24 patients who had made sui-
cide attempts. The suicidal group received
significantly more of Martin’s signs than the
controls. A significant positive association was
also found between the number of signs re-
ceived and severity of psychiatric illness. The
absence of any significant interaction between
suicidal classification and diagnostic category
indicated that the relationship between sui-
cidal disposition and number of signs was
similar in the different diagnostic groups.
However, the results did not delineate any
efficient cutting scores for predictive use.
Nevertheless, it is felt that the demonstrated
concurrent validity of Martin’s checklist justi-
fies further research to develop reliable norms
which, when considered in the light of base
rate incidence, might facilitate efficient pre-
diction of suicidal tendencies.


Research on acquiescence response set
(Zuckerman, Norton, & Sprague, 1958) and 
suggestibility (Zuckerman & Grosz, 1958) 
led to an interest in the possible relatedness 
of these variables to the trait called “depend- 
ency.” In assessing this trait, peer nomina- 
tions, objective and projective tests had been 
used. However, anyone working in the field 
of personality plunges into chaos when he 
faces the problem of measurement. One finds 
that widely differing techniques claim to meas- 
ure the same hypothetical variable, but in fact 
do not correlate with each other. The answer 
to this lack of construct validity given by 
some assessment theorists (Leary, 1957) is 
that the tests are measuring “different levels 
of personality.” The concept of levels is re- 
lated to the idea of a continuum of conscious- 
ness-unconsciousness. The more direct tests 
(those which ask the testee to describe his 
own feelings and reactions) are presumably 
at the upper levels, while the less direct tests 
(those which are disguised as creative or per- 
ceptual tasks) tap the lower, “unconscious,” 
levels. This distinction does not resolve the 
psychometric dilemma, for one must still 
demonstrate that a test at any level is meas- 
uring something more general than itself, even 
if this is indicated only by a correlation with 
other measures at adjacent levels. Further- 
more, at some point, a test should relate to 
behavior as assessed by something other than
another test. This may be observations by
psychologists in life situations, measurement
in controlled miniature situations, or descrip-
tions of the subject made by persons who have
had a chance to observe him over some pe-
riod of time.

The study reported here represents an at-
tempt to bring some order to measurement of
one construct: dependency. An attempt was
made to: conceptualize the dimensions of the
construct; select a battery of tests covering
the range of the direct-indirect continuum and
representing some of the current method-
ologies of test development; score these tests
within the same conceptual framework; as-
sess their validity in two ways: (a) concur-
rent validity, by comparing all tests against
an external criterion, a rating by peers; (b)
construct validity, by factor analyzing the
correlations between all tests and peer rat-
ings. It was assumed that the largest factor
to emerge from a factor analysis of all vari-
ables would represent a broad dependency
factor, and that loadings of the tests on this
factor could be taken as an indication of
construct validity. Of course, secondary fac-
tors also might have some bearing on con-
struct validity. In fact, it was hoped that the
factor analysis might help conceptualize the
dimensions of the broad construct, depend-
ency, beyond the initial working hypotheses.

The peer rating was selected as the sole
criterion to use for concurrent validity be-
cause it was closest to behavioral description
in a general sense. It should be noted that
measurement of personality from this source
depends on the effect that the subject’s be-
havior has on her peers, and does not neces-
sarily have a relevance to the subject’s own
image of herself or her underlying motives
and defenses.


METHOD 


The conception of dependency used in this study
was Horney’s (1945) description of the “compliant,
moving-toward-people” personality. Since most of
the tests we considered using were scorable within
the Murray (1938) need system, it was convenient
to translate Horney into Murray. Horney delineates
three marked traits of the compliant personality:
a need for affection and approval from others
(Succorance), a tendency to subordinate himself to
others, to inhibit assertiveness and criticality
(Deference and Abasement), a tendency toward
self-[illegible] and guilt (Abasement). Murray needs
at the opposite poles are: Autonomy and Dominance.


Subjects

The subjects were 78 sophomore student nurses living
together in a dormitory in the Indiana University
Medical Center. They were acquainted with one another
for about 5 months previous to the time when tests
were given. Six of this initial group were later
excluded from the study because their tests were
incomplete or because their Consistency score on the
EPPS was less than 9. The remaining group of 72
subjects were almost all between [age illegible] and
20 years of age. Their average ACE score was 58.6
(national norms percentile), indicating slightly
higher than average intelligence, relative to college
students. The group’s mean scores on the Edwards
Personal Preference Schedule variables were compared
with Edwards’ (1954) means for 749 female college
students. Critical ratios higher than 2.00 were found
for differences on 6 of the 15 scales. The student
nurse group scored significantly higher on Abasement,
Intraception, Nurturance, and Endurance, and
significantly lower on Dominance and Change. The
differences on Dominance, Abasement, and Nurturance
scores replicate differences found between a previous
student nurse group from this school and Edwards’
college females (Zuckerman, [year illegible]).
Additional comparisons were done on the Navran
Dependency scale (1954) and Gough Dominance scale
(Gough, McClosky, & Meehl, 1951). The student nurse
group scored significantly higher (p < .01) on the
Navran Dependency scale than a group of normals
described by Navran (1954). The average score of the
nurse group on the Gough Dominance scale fell between
the average scores reported (Gough et al., 1951) for
high and low Dominance groups, but fell much closer
to the mean for the low Dominance groups. In sum, it
would appear that this group of subjects differs from
the usual college student group in various personality
traits. The possible bearing of these differences on
our results will be discussed at the end of this paper.


Peer Ratings (PR) 


The items from four dimensions of Leary’s Inter- 
personal Check List (Leary, 1957) were adapted to 
form three eight-point bipolar scales providing de- 
scription at each point and fitting the hypothesized 
dimensions of the construct. These scales are given 
below: 


I. Pride-Shame 


1. Expects everyone to admire her 

2. Always giving advice, acts important, tries to 
be too successful 

3. Makes a good impression, often admired, re- 
spected by others 

4. Well thought of

5. Able to criticize herself

6. Apologetic, easily embarrassed, lacks self-con-
fidence

7. Self-punishing, shy, timid 

8. Always ashamed of herself 




II. Dominant-Submissive 


1. Dictatorial 

2. Bossy, dominating, manages others 

3. Forceful, good leader; likes responsibility 

4. Able to give orders 

5. Can be obedient 

6. Usually gives in, easily led, modest 

7. Passive and unaggressive, meek, obeys too
willingly

8. Spineless 


III. Independent-Dependent 

1. Egotistical, conceited, cold, unfeeling 

2. Boastful, proud, self-satisfied, snobbish, thinks 
only of herself, shrewd, calculating, selfish 

3. Independent, self-confident, self-reliant, can 
be indifferent to others, business-like 

4. Self-respecting, able to take care of herself

5. Grateful, appreciative 

6. Often helped by others, admires and imitates 
others, very respectful, very anxious to be 
approved of, accepts advice readily, trusting, 
and eager to please 

7. Dependent, likes to be taken care of, easily 
fooled, wants to be led 

8. Clinging vine, will believe anyone 


The subjects were asked to rate every nurse in the 
group whom they felt they knew or had observed 
enough to rate. They rated a peer by circling one of 
the numbers from 1 to 8 on each of the three scales. 
A combination score was obtained by summing each 
subject’s ratings for each peer that she rated. 
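
The scoring of the peer ratings can be sketched as follows. Each rater's three scale values are summed, and (as described later in the Results) the sums are averaged across raters for each subject. The ratings shown are invented for illustration, and the function name is hypothetical.

```python
def sum_pr(ratings):
    """Average, across raters, of the summed ratings on the three scales.

    ratings -- list of (scale_I, scale_II, scale_III) tuples, one per rater,
               each value an integer from 1 to 8.
    """
    if not ratings:
        raise ValueError("at least one rater is required")
    sums = []
    for rating in ratings:
        if len(rating) != 3 or any(not 1 <= value <= 8 for value in rating):
            raise ValueError("each rating needs three values between 1 and 8")
        sums.append(sum(rating))  # combination score from one rater
    return sum(sums) / len(sums)  # averaged across raters


# Three hypothetical raters describing one ratee.
example = [(4, 5, 5), (6, 6, 7), (3, 4, 4)]
print(round(sum_pr(example), 2))
```

Higher values lie toward the shame, submissive, and dependent ends of the scales, so a higher Sum PR indicates a ratee seen as more dependent.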


Tests 
Six basic kinds of instruments were used. In or- 
der of what was felt to be their directness, we have: 
1. Self-Ratings (SR): All subjects made self-rat-
ings on the same scales that they used to make peer
ratings. 

2. True-False Questionnaires (Q): 

a. Gough (1951) Dominance scale, developed 
from the MMPI using an empirical method, or item 
selection based on criterion groups 

b. Navran (1954) Dependency scale, developed 
from MMPI using content validity, or item selection 
by clinical judges 

In addition to the scores derived from the sepa- 
rate questionnaires, a combination score suggested by 
Ullman (1958) was formed by counting items on 
both tests scored for submissiveness and dependency 
and eliminating overlapping items. Ullman called this 
combined scale “Lack of Self-Assertion.” 

3. Forced-Choice Questionnaire: Edwards Personal 
Preference Schedule (EPPS) (Edwards, 1954). It is 
assumed that the elimination of the social desirability 
factor makes this kind of forced-choice questionnaire 
less direct than the usual true-false questionnaire, 
since the subject is less able to choose his responses 
in conformity to a more obvious stereotype of “ad- 
justment.” Furthermore, the pairing of 15 needs prob- 
ably makes it more difficult for the subject to grasp 
what is being measured than when all items pertain- 
ing to a particular need are presented in one scale. 
The only scales actually used were those relevant 
to our construct: Deference, Succorance, Abasement, 
Autonomy, and Dominance. A combination, or ratio, 
score was formed by converting the raw scores to 
Edwards’ standard scores and taking the ratio of 
Deference plus Succorance plus Abasement to the 
total sum of all five scores. 

4. Sentence Completion Test: Rohde’s (1957) Sen- 
tence Completion Test (SC) was used since the 
manual describes scoring in the Murray need system. 
The SC often is classed as a projective test because 
it is free-response. But it is more direct than other 
projective tests because it asks the subject to de- 
scribe his own feelings. The items were scored for 
the same variables as were scored in the EPPS, and 
the same ratio score was used as a combination score. 
The three experimenters scored the test one item at 
a time over all subjects. An initial discussion of the 
range of responses on the particular item was fol- 
lowed by independent scoring and then acceptance 
of scores with a minimum consensus (two out of 
three agreement on the basic score). 

5. TAT: Ten cards from the TAT (Cards 2, 3GF, 
4, 6, 7BM, 7GF, 9GF, 10, 12M, 18GF) were pre- 
sented to the group using an opaque projector. Sub- 
jects were allowed 5 minutes per card to write their 
stories. The three experimenters scored the stories one 
story at a time over all subjects. Each experimenter 
gave the story an initial score, and disagreements 
were resolved by conference technique. The five needs 
scored were the same as in the EPPS and SC, and a 
ratio score was formed in the same manner as on 
these two previous tests. 

6. Rorschach: The Rorschach was group adminis- 
tered using an opaque projector to project the cards 
on a screen. Subjects were allowed 3 minutes to write 
down the different things they could see on a card. A
scoring system for Dependency content was adapted 
from that of DeVos (1952). After some practice scor- 
ing, it was found necessary to eliminate some of his 
more ambiguous categories. Two assistants independ- 
ently scored the content of the 72 records. The cor- 
relation between scorings was .83. The frequency 
scores of the two scorers were averaged for each sub- 
ject, and this average was divided by the subject’s 
total number of responses to get a Dependency per- 
centage score. 
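
The two derived scores described above, the ratio score used for the EPPS, SC, and TAT and the Rorschach Dependency percentage, can be sketched as follows. All input values are hypothetical, and the function names are illustrative rather than taken from the study.

```python
def ratio_score(deference, succorance, abasement, autonomy, dominance):
    """Ratio of Deference + Succorance + Abasement to the sum of all five scores."""
    dependent = deference + succorance + abasement
    total = dependent + autonomy + dominance
    if total == 0:
        raise ValueError("the five scores sum to zero; the ratio is undefined")
    return dependent / total


def dependency_percentage(scorer_a_count, scorer_b_count, total_responses):
    """Mean of two scorers' Dependency content counts, as a percentage of responses."""
    if total_responses <= 0:
        raise ValueError("the record must contain at least one response")
    mean_count = (scorer_a_count + scorer_b_count) / 2
    return 100.0 * mean_count / total_responses


# Hypothetical standard scores for one subject on the five needs.
print(ratio_score(deference=55, succorance=60, abasement=50, autonomy=45, dominance=40))

# Hypothetical Rorschach record: the scorers found 4 and 6 Dependency
# contents among 25 responses.
print(dependency_percentage(4, 6, 25))
```

Both scores express dependency content relative to total output, so that a subject who simply gives more scorable material does not obtain a higher score on that account.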


RESULTS AND DISCUSSION 
Peer Rating Reliabilities and Intercorrelations 


The number of peers rating each subject 
ranged from 37 to 74 with a median of 57. 
Reliabilities of the peer ratings were calcu- 
lated using a formula devised by Horst 
(1949). The resultant reliability coefficients 
for PR I, PR II, PR III, and the Sum PR were
.94, .96, .94, and .96. The correlations be-
tween the three PRs were close to the maxi-
mum limit set by their reliability: .92, .91,
and .92. Apparently, the subjects made no dis-
tinctions between the three scales in rating
their peers. However, the halo effect cannot be
attributed to a general positive-negative re-
action because the “negative,” or “undesir-
able,” descriptions are at both ends of the
scales. A general evaluation might affect de-
gree, but could not affect direction of the rat-
ings, i.e., the rater still had to decide whether
she did not like the ratee because the ratee
was too dependent or too independent. How-
ever, this “across-scales” halo effect precluded
using the PR to reveal possible dimensions of
the general construct. Therefore, only the Sum
PR was used in all comparisons with other
measures. This measure was averaged across
judges for each subject. The distribution of
these average PRs was essentially normal. 
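
The limit that reliability sets on a correlation can be made concrete with the classical attenuation relation, under which the correlation between two measures cannot exceed the square root of the product of their reliabilities. The sketch below applies that relation to the reliabilities reported above; the function name is illustrative, and the assignment of the three coefficients to the scales in the order listed is assumed.

```python
from math import sqrt


def max_correlation(rel_x, rel_y):
    """Upper bound on the correlation between two measures given their reliabilities."""
    if not (0.0 <= rel_x <= 1.0 and 0.0 <= rel_y <= 1.0):
        raise ValueError("reliabilities must lie between 0 and 1")
    return sqrt(rel_x * rel_y)


# Reliabilities of the three peer rating scales as reported in the text.
reliabilities = {"PR I": 0.94, "PR II": 0.96, "PR III": 0.94}

names = list(reliabilities)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        ceiling = max_correlation(reliabilities[first], reliabilities[second])
        print(f"{first} with {second}: ceiling = {ceiling:.3f}")
```

With ceilings of about .94 to .95, the obtained correlations of .91 and .92 leave little reliable variance that is specific to any one scale.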


Self-Ratings: Intercorrelations 


The subjects showed more variation be-
tween the three scales in rating themselves,
for the correlations between SRs on the three
scales were .44, .37, and .47. It was therefore
possible to analyze these scales in a combina-
tion score and singly.


Analysis of Combination Scores 


Table 1 lists the intercorrelations of the
sum and ratio scores for each of the six
techniques. The correlations between each of
the tests and the PR can be seen in the first
row across. These correlations tend to fall off
as a function of the postulated indirectness of
the test. The self-ratings yielded the highest
correlation, the questionnaires of both types
and the SC test were intermediate, and the
TAT and Rorschach were lowest. Only the
three more direct tests correlated significantly
with the criterion, the SC test correlated just
below the .05 level, and the TAT and Ror-
schach correlated close to zero with the cri-
terion. The combined indices did not mask
any high relationships between individual
scores and the criterion, although in some
cases one could have done just as well or
slightly better using a single variable. For in-
stance, the Gough Dominance scale correlated
more highly (—.33) when used alone than
when used in combination with the Navran
Dependency scale. While the SC ratio did not
correlate significantly with the PR, the SC
Autonomy score did correlate significantly
([value illegible]).


It can also be seen in Table 1 that all of
the three most direct techniques were signifi-
cantly intercorrelated. The SC test correlated
significantly with the EPPS, and approached
significance in its correlations with the other
more direct techniques. The TAT and Ror-
schach did not correlate significantly with
each other or any of the other techniques. An-
other pattern seen in this table is the fact that
the highest positive correlations are on the
diagonal of the matrix and that correlations
tend to drop off or become negative in either
direction from the diagonal. This means that
each test tends to vary most directly with the
test closest to it on the postulated continuum
of “directness-indirectness.” This finding of-
fers support for the original placement of
these tests in this continuum. The SC test
seems to have more in common with the di-
rect objective tests than with the indirect free-
response tests.

A further analysis of TAT and Rorschach
was undertaken in order to check the possi-
bility that a more global approach to the in-
direct techniques might yield greater concur-
rent validity. The TAT and Rorschach proto-
cols of the 10 highest and 10 lowest subjects
on the Sum PR criterion were selected for this
TABLE 1
INTERCORRELATIONS OF SUM AND
RATIO SCORES
Technique SR Q EPPS SC TAT Ror ACE
PR dhe 28" 24% 22 05.02 
SR 37* 37% 23  —.22 12 
2 AS —.10 — 
EPPS 30% oo — 
SC 11 17 
TAT 
Ror 


* Significant at or below .05 level. 
** Significant at or below .01 level. 


purpose. These protocols were given to two 
experienced clinical psychologists along with 
the rating instructions used by the peers in 
making their ratings. They were asked to pre- 
dict which subjects were rated as dependent 
and which as independent by their peers. 
They were informed that one-half of the 
group of 20 fell into each of these categories, 
and were asked to make their sortings using 
the same distribution. Quantitative scores in- 
volved in the comparisons were dichotomized 
at the median and comparisons were made
using exact probability tests for 2 X 2 tables. 

Using the TAT, the judges did not demon- 
strate significant agreement between them- 
selves (60%), or with the PR criterion (40% 
and 70%), or the TAT ratio (50% and 60%). 
The lack of agreement with the latter score 
may be either a function of the unreliability 
of the judgments or a basic lack of compara- 
bility between the global and more atomistic 
methods involved. 

Using the Rorschach, the agreement be- 
tween judges was somewhat better (70%) but 
still short of significance. Agreement between 
both judges and PR was 50% or exactly 
chance probability. Agreement between both 
judges and the dichotomous classification 
based on the Rorschach Dependency percent- 
age score was 80%. The probability of either 
of these judgments occurring by chance was 
less than .02. It would appear from this analy- 
sis that global judgments based on the Ror- 
schach were related to the scores derived from 
the more atomistic scoring method, but were 
no more valid than those scores. 
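
The exact probability test for these 2 × 2 tables can be sketched directly. Because both the judge's sorting and the classification compared with it place 10 of the 20 subjects in each category, percentage agreement fixes the whole table: 80% agreement means 8 subjects are classed as dependent by both. The sketch assumes a one-tailed test on the fixed-margin (hypergeometric) distribution; the text does not state whether one or two tails were used, and the function name is illustrative.

```python
from math import comb


def agreement_tail_probability(both_dependent, group_size=10):
    """One-tailed exact probability of at least this much agreement by chance.

    both_dependent -- number of subjects classed as dependent by both sources
    group_size     -- subjects per category in each sorting (10 and 10 here)
    """
    if not 0 <= both_dependent <= group_size:
        raise ValueError("both_dependent must lie between 0 and group_size")
    total_sortings = comb(2 * group_size, group_size)
    favorable = sum(
        comb(group_size, k) * comb(group_size, group_size - k)
        for k in range(both_dependent, group_size + 1)
    )
    return favorable / total_sortings


for percent_agreement in (60, 70, 80):
    both = percent_agreement // 10  # 2 * both of the 20 subjects are classified alike
    p = agreement_tail_probability(both)
    print(f"{percent_agreement}% agreement: one-tailed p = {p:.4f}")
```

Under this assumption, 80% agreement gives a one-tailed probability of about .012, in line with the reported value of less than .02, while 70% agreement gives about .089 and so falls short of significance.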


Factor Analysis of the Combination Scores 


The matrix contained in Table 1 was fac- 
tor analyzed using a principal components 
program on the IBM 650. One large factor
accounted for 92% of the variance contained
in the communality estimates derived from
multiple correlations on each variable.³ The
significant loadings (significance arbitrarily 
set at .30) on this General Dependency fac- 
tor were as follows: PRs (.52), SRs (.59), Q 
(.61), EPPS (.68), and SC (.46). TAT and 
Rorschach had loadings of negligible magni- 
tude (—.02 and —.23). The ACE had a sig- 
nificant negative loading (—.45). 

The conclusions are essentially those drawn 
from the matrix. The three more direct tests 
demonstrate the greater validity, the SC test 
is intermediate, and the two most indirect 
tests demonstrate no validity. The loading of 
intelligence (ACE) could be interpreted in 
one of two ways: (a) intelligence is an ex- 
trinsic factor influencing the measures through 
response sets which are also extrinsic to the 
trait dependency, (b) intelligence is a factor 
which is intrinsically related to dependency 
and its evaluation in others. Although Hy- 
pothesis a would explain the correlation be- 
tween ACE and the peer ratings (see Table 1), 
it is hard put to explain the correlation be- 
tween ACE and EPPS, a test carefully con- 
trolled for the usual response sets, or the cor- 
relation with SC, a free-response test. It must 
be remembered that a high intelligence level 
gives one a greater potentiality for independ- 
ence particularly in the academic and work 
situations. 


Factor Analysis of the Individual Scores 


All of the single variable scores were inter- 
correlated, and the resulting 23 X 23 matrix 
was factor analyzed using Indiana Univer- 
sity’s IBM 650. Five factors were found to 
account for 97% of the variance contained in 
the estimated communalities.⁴ These factors


3 A copy of this factor matrix has been deposited
with the American Documentation Institute. Order 
Document No. 6756 from ADI Auxiliary Publications 
Project, Photoduplication Service, Library of Con- 
gress; Washington 25, D. C., remitting in advance 
$1.25 for microfilm or $1.25 for photocopies. Make 
checks payable to: Chief, Photoduplication Service, 
Library of Congress. 

4 Copies of the correlation matrix and the rotated 
factor matrix have been deposited with the Ameri- 
can Documentation Institute. Order Document No. 
6756 from ADI Auxiliary Publications Project, Pho- 
were rotated to an orthogonal solution using
the normalized varimax IBM program.⁵

Factor I was the most general Independ- 
ence-Dependence factor obtained. Loadings of 
over .30 and in the expected direction were 
found for the Sum PR, all of the SRs, the 
Gough Dominance scale, all of the EPPS 
scales (with the exception of Abasement), 
and Autonomy in the SC test. Loadings in 
the unexpected direction were obtained from 
Abasement in the SC and TAT tests. These 
two loadings are perhaps explainable by the 
fact that the total score of all five variables 
was not partialed out of the scores for single 
variables. These total scores (measures of the 
intensity or scorability of the free-responses) 
were found to correlate negatively with many 
of the dependency measures, and high and 
positively with the Abasement variable in 
both tests. 

As in the previous factor analysis, the more 
direct measures seemed to have the highest 
loadings on the most general dependency 
factor. 

Factor II might be labeled Dominance vs.
Abasement. It had positive loadings from the
Gough Dominance and EPPS Dominance
scales, and negative loadings from the Navran
scale, EPPS Abasement, and SC Abasement
scales. An examination of the Navran scale
reveals that most of the items do have an
abasement type of content with a lesser num-
ber of succorance items. EPPS Succorance
also had a negative loading on this factor.
The factor seems to involve assertion, leader-
ship, and self-confidence at one pole; and
yielding, resignation, inferiority feelings, and
feelings of depression and fear at the other.
The factor might relate to Cattell’s (1957) 
Dominance vs. Submission factor, but it also 
seems to involve an emotionality element. 

Factor III might be labeled Autonomy vs.
Deference. It had positive loadings from the
SC Deference and TAT Deference scores, and
negative loadings from the SC Autonomy and
TAT Autonomy scores. In addition, the TAT
Succorance score had a negative loading and
the ACE loaded positively on the factor. This
factor is interesting in that it is derived en-
tirely from two projective, or free-response,
techniques, and has some obvious construct
consistency.

5 Users of IBM rotation programs may be inter-
ested to know that the quartimax program
[text illegible] program. Both programs yielded
substantially similar results, and differences in
loadings were minimal.

Factor IV might be labeled Succorance. It
had positive loadings from the Navran scale,
EPPS Succorance, and SC Succorance. A
negative loading was obtained from EPPS
[scale name illegible]. A puzzling positive loading
was obtained from TAT Dominance.

Factor V was practically uninterpretable.
It had negative loadings from SC Dominance
and Autonomy on the one hand and TAT
Abasement and Rorschach Dependency % on
the other. In a subsequent analysis, this factor
was collapsed and its variance distributed
among the first four factors. This operation
did not change the composition of these first
four factors.


Construct validity seems to span three
tests at the most. Usually, tests which are
closer in their degree of directness share some
common variance on comparable variables.

The last factor analysis also exposed some
of the obvious weaknesses of the tests. Al-
though all three of the self-ratings loaded on
the general dependency factor they did not
discriminate among the other dimensions of
the construct. Although the Navran scale is
called “Dependency” it seems to be more of
an abasement scale. The Rorschach had only
one loading of any magnitude, and this was
on a small and uninterpretable factor. Pro-
ponents of the Rorschach might take some
heart from an interesting side-finding in an
unpublished study by Levitt and Zuckerman
comparing the characteristics of Volunteers
and Non-Volunteers for a hypnosis experiment,
using this sample of subjects. The Ror-
schach Dependency score was the only one
which distinguished these two groups with
the Volunteers scoring higher on it (p < .01).
But in general, whether we consider concur-
rent validity (correlation with the peer rat-
ings) or construct validity (loading on a gen-
eral dependency factor), the more direct tests
seem to do better than the less direct tests.
There is a certain irony in these results if one 
considers the time and effort expended on 
them. The self-ratings required only a few
minutes to administer, and a technician could 
record one subject’s score in less than a 
minute. The TAT took about an hour to ad- 
minister and about an hour each for three 
trained psychologists to score one subject’s 
record. If one starts calculating cost in terms 
of return (validity) a kind of moral can be 
drawn: try out simple techniques before re- 
sorting to complex techniques. This is not the 
first study to suggest that empirically de- 
veloped, objective, and simple psychometric 
instruments may do better, just as well, or 
just as poorly as more complex free response 
techniques. 

There are, however, important reservations 
to forming sweeping conclusions from the re- 
sults of this study: 

1. The concurrent validity of self-ratings 
may be a function of their greater similarity 
in form to the peer ratings. However, one 
would be rather foolish not to make self-rat- 
ings as congruent as possible with the cri- 
terion being predicted. 

2. The peer and self-rating methods are 
lacking in that they do not discriminate di- 
mensions within the construct, and are vul- 
nerable to halo effect. 

3. The results may not have applicability 
for other kinds of constructs. Simple tech- 
niques such as the Taylor scale (1953) and 
an Affect Adjective Check List devised by the 
senior author (Zuckerman, 1960) seem to 
work well in measuring “anxiety,” but one 
wonders what the results would be with a less 
socially acceptable symptom like “hostility.” 
One would expect that the less socially ac- 
ceptable constructs would be more difficult to 
measure with direct techniques, but it does 
not follow from this that indirect techniques 
would do better. 

4. The results may be in some part a func-
tion of the particular type of subjects used. 
As a group, the Indiana student nurse seems 
to be more abasing and nurturant, and less 
dominant than Edwards’ (1954) college fe- 
males on the EPPS. Perhaps dependency 
(particularly the submissive component) is 
more socially acceptable in terms of the
“nurse” ego-ideal. This particular group of 
nurses is also brighter than average, and 
higher on the EPPS Intraception score. Di- 
rect techniques may be less useful with duller 
or less “self-analytic” groups. 

Allport (1953) has suggested that direct 
measures of motivation will be more effective 
than indirect measures with normal subjects, 
while indirect, or projective, techniques may 
be useful in assessing the maladjusted. A con- 
servative conclusion from this study would be 
a simple underlining of Allport’s (1953) state- 
ment: “A psychodiagnostician never should 
employ projective methods in the study of 
motivation without at the same time employ- 
ing direct methods” (p. 111). 


SUMMARY 


The purpose of the study was to test the 
validity of a range of direct and indirect tests 
against a peer rating criterion (concurrent va- 
lidity) and the factors derived from a factor 
analysis of all measures (construct validity). 
Another purpose was to clarify the dimensions 
of the broad construct, “dependency.” 

The subjects were 72 student nurses. They 
rated themselves (self-rating) and each other 
(peer rating) on three bipolar, eight-point 
scales derived from Leary’s Interpersonal Ad- 
jective Check List. All subjects took the 
Gough Dominance and Navran Dependency 
questionnaires, the Edwards Personal Prefer- 
ence Schedule, the Rohde Sentence Comple- 
tion Test, and a group administered TAT and 
Rorschach. The EPPS, SC, and TAT tests 
were scored for five relevant Murray needs: 
Autonomy, Dominance, Succorance, Abase- 
ment, and Deference. The Rorschach content 
was scored using a system adapted from De- 
Vos. Combination scores were obtained by: 
summing the three peer ratings, summing the 
three self-ratings, combining the two true- 
false questionnaires, and using ratios combin- 
ing the five Murray needs on the EPPS, SC, 
and TAT. Combination scores were correlated 
with peer ratings and with each other, and the 
resulting matrix was factor analyzed. Indi- 
vidual scores on all tests were analyzed in a 
second factor analysis. 

Using combination scores, the self-ratings,
questionnaires, and EPPS scores correlated
significantly with the peer ratings, while the
SC, TAT, and Rorschach did not. The magni-
tude of the validity correlations tended to
drop as a function of the indirectness of the
tests. More global judgments of the TAT
and Rorschach, using extreme groups on the
peer rating distribution, did not indicate any
greater concurrent validity for these tests. A
factor analysis of the combination scores
yielded one large factor called General De-
pendency. All of the four most direct tests
showed moderate to high loadings on this
factor while the less direct tests (TAT and
Rorschach) showed negligible loadings. A fac-
tor analysis of the individual scores yielded
four interpretable factors: (a) General De-
pendency, (b) Dominance vs. Abasement, (c)
Autonomy vs. Deference, (d) Succorance.
Factors a, b, and d were composed mainly of
individual scores from the peer and self-rat-
ings, questionnaires, EPPS, and SC tests. Fac-
tor c was composed entirely of scores from
the SC and TAT tests.

In general, the more direct measures of de-
pendency demonstrated the greater validity
of both types, but the conclusions about the
relative validity of the two types of tests are
limited by the form of the measures, the par-
ticular type of subjects used, and the par-
ticular construct investigated.
RESPONSE BIAS IN QUESTIONNAIRE REPORTS ¹


N. H. AZRIN, W. HOLZ, AND I. GOLDIAMOND ²
Anna State Hospital, Illinois 


If one defines psychology as the study of 
behavior, the direct measurement of behavior 
appears to be a minimal prerequisite to fur- 
ther analysis. Several alternatives to a direct 
measurement of behavior are commonly prac- 
ticed. One such indirect method defines and 
measures the behavior in terms of the effect 
of that behavior upon the environment. For 
example, the measurement of reaction time 
typically is based upon the moment of closure 
of an electric switch or push-button. The sim- 
ple fact of closure of the push-button does not 
guarantee that the movement of any one 
finger be involved since any finger or even the 
palm, wrist, arm, or leg might just as easily 
have been used. Such ambiguities in inter- 
pretation are easily overcome, and the experi- 
menter can easily confirm his interpretation 
of the switch closure by occasionally or con- 
tinuously observing the behavior directly. 

A second alternative to direct behavioral 
observation is the interview or questionnaire 
procedure. Here the experimenter typically 
does not have any simple means of direct be- 
havioral observation. Rather, the subject him- 
self is expected to observe his own behavior 
and to describe it at some future date. The 
subject’s reports usually cannot be evaluated 
by direct observation of the behavior being 
reported as was true of closing of the switch. 
The problem is often enhanced by the fact 
that the behavior being reported upon is 
basically unobservable by the experimenter 
by its very nature. This is true for the so- 
called “subjective reactions” as when an indi-
vidual states that he feels hostile or afraid. In
addition, it is quite likely that a report of 
one’s own behavior will be modified consid- 


1 This investigation was supported by a grant from 
the Psychiatric Training and Research Fund of the 
Illinois Department of Public Welfare. 

2 Now at Arizona State’ College, Tempe, Arizona. 
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erably by the audience or experimenter to 
whom the report is being made. Other factors 
such as social acceptability may also be in- 
volved. The reply to the question “Did you 
cheat?” would probably be different if the 
interviewer were a classmate than if he wer? 
an instructor. For whatever the ultimate re 
sons may be, the individual being questione! 
may have a pre-existing tendency or bias a 
admitting to some statements and not {0 
others. The present study was performed 
order to study the influence of such response 
biases upon the reports of behavior obtainé 
through a questionnaire. 

A well-known study by Shaffer (1947)
deals with the reports of combat flyers of
their fears in combat. The reports had been
obtained from these flyers by means of a
questionnaire some 2 months following the
termination of their combat experiences. This
study has been widely interpreted as demonstrating
that some behavioral reactions such
as “soiling one’s pants” are more an [...]
of fear in combat than are behavioral reactions
such as “feeling nervous and tense.” In
order to evaluate the validity of this interpretation,
160 college freshmen and sophomores,
including males and females, in 5
separate psychology and sociology classes
were given a questionnaire. This questionnaire
contained the same 15 “symptoms” reported
in the original investigation by Shaffer. All of
the questionnaires had the following directions:


Imagine that you are a combat flyer who has flown
many missions over enemy territory. Your commanding
officer gives you the questionnaire below and
tells you to fill it in. Fill in the answers keeping in
mind what your commanding officer expects you to
have felt.

Two forms of the questionnaire were used,
however; half of each class of students
received one form and half received the other
form. One form stated after the first sentence:


You have been extremely frightened on all of your
missions and have experienced each of the symptoms
below on every flight.


The other form stated:


You have never been frightened on any of your missions
and have not experienced each of the symptoms
below on every flight.


Half of the students are thereby told
that they have never experienced any of the
listed symptoms, whereas the other half are
told that they have experienced all of the symptoms.
Further, all students were instructed by
the questionnaire to answer in terms of what
is expected, regardless of what behavior is
assumed to have occurred.


RESULTS AND DISCUSSION


Table 1 presents the percentage of students
stating each symptom as occurring “often”
or “sometimes.” It will be recalled that the
students had been told that all symptoms were
experienced equally often. The only basis for
selecting some symptoms more than others
was the specific instruction to keep in mind
what answers are expected. Had there been
no disposition or response bias toward
some symptoms, one should expect all symptoms
to have been selected equally often. Certainly,
no particular rank ordering of the
symptoms should have emerged. The results
of Table 1 demonstrate that the selection of
symptoms does not follow a random distribution.
Some symptoms were selected as much
as [...] times as often as others.
Spearman rank-order correlations were performed
to determine the consistency of this
response bias among the students. It was
found that the rank-order of symptoms for
males was correlated with that of females with
[...]. Similarly, the rank-order correlation
was .95 between those students who
were told that they had experienced all
the symptoms and those who were told that
they had not experienced the symptoms. This
high degree of similarity demonstrates that
the response bias toward certain symptoms
existed regardless of whether or not the symptom
was alleged to occur.


TABLE 1

REPORTED SYMPTOMS OF COMBAT FEAR OF
160 COLLEGE STUDENTS

During combat missions                                        % of students stating
did you feel:                                                 “often” or “sometimes”

That your muscles were very tense                                     72
A pounding heart and rapid pulse                                      71
“Butterflies” in the stomach                                          67
Dryness of the throat or mouth                                        67
“Nervous perspiration” or “cold sweat”                                61
Sense of unreality that this couldn’t be happening to you             49
Easily irritated, angry, or “sore”                                    43
Need to urinate very frequently                                       42
Trembling                                                             39
Unable to concentrate                                                 36
Sick to the stomach                                                   34
Right after a mission, unable to remember details of what happened    32
Confused or rattled                                                   28
Weak or faint                                                         25
That you have wet or soiled your pants                                11


In order to determine whether the same response
bias might have affected the reports of
the combat flyers, a Spearman rank-order correlation
was performed between the symptoms
reported by the flyers and those reported
by the students. It was found that
the rank orders of the responses were highly
similar (ρ = .89) between the students and
the flyers. Nor was this relationship reduced
for those students who were told that they
had not experienced the symptoms (ρ = .94)
as compared with those who were told that
they had experienced all of the symptoms
(ρ = .90). The statistical stability of this response
bias is evidenced by the degree to
which the rank-order in each classroom of
students was correlated with the rank-order
reported by the combat flyers: ρ = .70, .82,
.85, .90, and .92.
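
The rank-order comparison described above can be illustrated with a short sketch. It is not the authors' computation: the student percentages are those of Table 1, but the comparison list is HYPOTHETICAL (it is not Shaffer's flyer data), and the function names are chosen here for illustration. Tied percentages receive averaged ranks, and rho is taken as the product-moment correlation of the two sets of ranks.

```python
def average_ranks(values):
    """Rank values from largest (rank 1) to smallest, averaging tied ranks."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank-order correlation, computed on averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Table 1 percentages for the 15 symptoms, in table order.
students = [72, 71, 67, 67, 61, 49, 43, 42, 39, 36, 34, 32, 28, 25, 11]

# HYPOTHETICAL comparison percentages for the same 15 symptoms.
comparison = [60, 75, 55, 58, 50, 52, 40, 30, 45, 20, 25, 22, 35, 15, 5]

rho = spearman_rho(students, comparison)
print(round(rho, 2))
```

The averaged-rank form is used because two symptoms in Table 1 are tied at 67%, a case the simple difference-of-ranks formula does not handle exactly.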

The response pattern obtained from the
students by means of the questionnaire is almost
completely predictable on the basis of
response bias. Therefore, it is quite likely that
the same type of response bias operated on 
the combat flyers. Any conclusions concerning 
the actual symptoms must await study by a 
method that provides for a more direct and 
objective measurement. 

The present findings may well be consid- 
ered for their implications for the use of in- 
terview and questionnaire methods in general. 
Unless an objective and direct means of measurement
is available, the questionnaire responses
may be independent of the behavior
being studied. A definitive method of determining
the validity of the reports is the direct
and objective measurement of the behavior
being reported. Once such a direct measure
is available, however, the very need for questionnaire
reports is eliminated.


THE MMPI AS A MEASURE OF CHARACTER
STRUCTURE AS REVEALED BY FACTOR
ANALYSIS 1


JOSEPH C. FINNEY
Hawaii Department of Health, Honolulu 


The psychologist who needs conceptual
tools to understand and to help people, faces
a dilemma. On the one hand he can accept
and use dynamic concepts, usually psychoanalytic
ones, including urges and defense
mechanisms, which seem fairly satisfactory in
working with patients. If he does so, it is
with the disadvantage that he assumes some
cause and effect relationships on slender evidence,
and that he treats as quantities many
things which he has no way of measuring. On
the other hand, if he is hard-boiled, skeptical,
[...] and parsimonious, he may confine
himself to relationships experimentally or
statistically established, and to quantities that
can be accurately measured. If so, he hampers
himself seriously: his concepts are at a
level too [...] and simple to be of value
in understanding and dealing with his patients’
troubles.


There is, further, a variety of opinion about
what quantities an objective psychological
test should measure (Cattell, 1957a; Gough,
1953; Hathaway, 1951). Even factor analytic
studies (Cattell, 1957b; Cook & Wherry,
1950; Lingoes, 1960; Welsh, 1956;
Wheeler, Little, & Lehner, 1951; Williams
& Lawrence, 1954) have not agreed on
what fundamental dimensions are uncovered.
It seems desirable that tests aimed at variables
of importance in clinical, including psychoanalytic
work, be included in factor analytic
studies. The recent development of
programs by which factor analysis, including
rotation to oblique simple structure, can be
performed entirely by electronic computer
(Dickman, 1959; Pinzka & Saunders, 1954)
makes it possible for the psychologist with
minimal mathematical training and little time
for computation to do such studies.


1 The following materials have been deposited with
the American Documentation Institute: Tables A
through M [...] reference vector loadings [...];
[...] among groups; lists of items in [...];
intercorrelations of the 59 variables; successive
[...] roots of factors; complete tables of
[...] the first two centroid factors; [...]
correlations among oblique factors; [...]
tables of reference vector and primary
loadings of oblique factors. Order Document
[...] from ADI Auxiliary Publications Project,
Photoduplication Service, Library of Congress;
Washington 25, D. C., remitting in advance
[...] for microfilm or $7.50 for photocopies. Make checks
payable to: Chief, Photoduplication Service, Library
of Congress.


AIMS OF THIS STUDY 


The battery of measurements used in this 
study evolved gradually from the clinical de- 
mands of a county mental health clinic. For 
nearly 4 years the author and other psychol- 
ogists made blind interpretations from the 
MMPI scores of applicants for clinic services. 
These personality descriptions were compared 
with the reports of interviews, and attempts 
made to account for discrepancies. One result 
was to increase the psychologists’ skill in
drawing inferences from MMPI scores. Even 
so, there remained many areas, regarded as 
important in clinical interviews, which no 
standard MMPI scale seemed to measure. One 
by one, additional scales were added to fill 
the gaps. 

The strongly felt need for a measure of 
strength of ego-ideal or conscience was met by 
Gough’s (1957) responsibility scale. The clin- 
ical desire for measures of hysterical character 
(defense of repression) and paranoid char- 
acter (defense of projection) was tentatively 
met by adding to the battery Wiener’s (1948)
“subtle” subscales of Hy and Pa, which
sounded promising. The “ego functions” were
tapped by some of Gough’s (1957) scales,
such as Ie (intellectual efficiency) and Do
(dominance). Finally, some still untouched
areas, such as stinginess and warmth, were in- 
cluded by adding several unpublished experi- 
mental scales devised for the purpose by the 
author. With the aid of the supplementary 
scales, the psychologist interpreting blindly 
from test scores became able to deal with the 
aspects of personality that interested the clin- 
ical interviewers, and with a satisfactory de- 
gree of agreement. 

The question arises, which of the many 
measures represent important basic aspects 
of personality? Cattell (1952) has contended 
that factor analysis with rotation to oblique 
simple structure eliminates the superficialities 
and reveals the fundamental underlying dimensions.
It seemed appropriate to apply that
method to the group of personality measures 
that were used. 


METHOD 


A questionnaire of 600 items was given routinely 
to all willing literate, nonpsychotic adults applying 
for services (whether as patients or as parents of 
child patients) at a community mental health clinic 
in Illinois, until 50 men’s and 50 women’s records 
were obtained (a process which took about 6 
months). The questionnaire consisted of the booklet 
form of the MMPI, with some unscored and dupli- 
cating items omitted and some additional items in- 
cluded. 

Tests were scored for 56 scales and two individual 
items. Intercorrelations were computed for 59 vari- 
ables (the 58 test variables plus sex), using raw 
scores. The variables used were: 


1. Sex (male or female)

2. Sex experience (A single item: “I’ve had sex relations
with at least three different people in my
life.”)

3. Anti-Catholic prejudice (A single item: “I have
no religious prejudice, but I think it would be bad
for the country if a Catholic were elected President.”
The study was done in 1959.)

4. L

5. F

6. K

7. Hs (without K correction)

8. HsK (Hs with K correction)

9. D

10. Hy

11. Pd (without K correction)

12. PdK

13. Mf

14. F minus K (Gough, 1950)

15. PaO (Wiener’s, 1948, “obvious” subscale of Pa)

16. Pa

17. Pt (without K correction)

18. PtK

19. Sc (without K correction)

20. ScK

21. Ma (without K correction)

22. MaK

23. Si (Drake, 1946)

24. A (Welsh, 1956)

25. MA (manifest anxiety, Taylor, 1953)

26. Ie (intellectual efficiency, Gough, 1957, MMPI
items only)

27. St (capacity for social status, Gough, 1957,
MMPI items only)

28. Es (ego strength, Barron, 1953)

29. Do (dominance, Gough, 1957, MMPI items
only)

30. Re (responsibility, Gough, 1957, MMPI items
only)

31. H1 (an experimental scale on hostility, [...]
items)

32. H2 (an experimental scale on hostility, [...]
items)

33. Sc minus Ma

34. We (an experimental scale on warmth, [...]
items)

35. ODy (an experimental scale on optimistic dependency,
9 items)

36. Dy (dependency, Navran, 1952)

37. Ph (an experimental scale for phobia, 12 items)

38. Sd (an experimental scale for sadism, 12 items)

39. Ob (an experimental scale for obsession, [...]
items)

40. An (an experimental scale for anal or compulsive
character, containing Scales 41, 43, 44, and 45
and a few additional items, 48 items)

41. Rig (rigidity-flexibility, Gough, 1957, reduced
to 15 items)

42. AuF (the Berkeley F Scale, Adorno, Frenkel-Brunswik,
Levinson, & Sanford, 1950, reduced to 15
items)

43. Or (an experimental scale for orderliness, [...]
items such as: “I like to have my clothes clean at all
times.”)

44. Sti (an experimental scale for stinginess, [...]
items such as: “I have to admit I hate to see [...]
wasted.”)

45. Stb (an experimental scale for stubbornness, [...]
items such as: “People can push me just so far and
then I have to take a stand.”)

46. Reb (an experimental scale for rebelliousness,
20 items)

47. Sub (an experimental scale for submissiveness,
26 items)

48. Ap (accepted passivity, Harris, unpublished)

49. Id (an experimental scale intended to measure
inner direction versus other direction, a concept of
Riesman, Glazer, & Denney; 21 items)

50. De (delinquency, Gough & Peterson, 1952; renamed
socialization, Gough, 1957, MMPI items only)

51. Fm (feminine masochism, Hecht, 1950)

52. PaS (Wiener’s, 1948, subtle subscale of Pa)

53. R (Welsh, 1956)

54. HyS (Wiener’s, 1948, subtle subscale of Hy.
Our selection followed a somatic-nonsomatic division,
and differed slightly from Wiener’s in including Items
[...] and 179 and excluding Item 190. Little’s Dn scale
(Little & Fisher, 1958) is also virtually identical with
[...].)

55. Rep (an experimental scale for repression, containing
the 29 HyS items plus 17 others)

56. Em (an experimental scale for embarrassment,
13 items)

57. Fe (femininity, Gough, 1957, MMPI items
only)

58. Sc minus Pt

59. Hy minus PtK
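
Several of the 59 variables listed above are derived from other scales rather than scored independently (the K-corrected scales and the difference scores). The following is a minimal sketch of how such derived scores might be formed. The K fractions are the customary MMPI weights and are an assumption here, since the text does not state them; the use of unrounded raw scores for the difference variables is likewise assumed, and the respondent's scores are HYPOTHETICAL.

```python
# Customary MMPI K-correction fractions (ASSUMED; not stated in the text).
K_WEIGHTS = {"Hs": 0.5, "Pd": 0.4, "Pt": 1.0, "Sc": 1.0, "Ma": 0.2}


def derived_scores(raw):
    """Return K-corrected and difference scores from a dict of raw scale scores."""
    out = {}
    # K-corrected scales: raw score plus a fixed fraction of the K raw score.
    for scale, weight in K_WEIGHTS.items():
        out[scale + "K"] = raw[scale] + weight * raw["K"]
    # Difference variables named in the variable list.
    out["F minus K"] = raw["F"] - raw["K"]
    out["Sc minus Ma"] = raw["Sc"] - raw["Ma"]
    out["Sc minus Pt"] = raw["Sc"] - raw["Pt"]
    out["Hy minus PtK"] = raw["Hy"] - out["PtK"]
    return out


# HYPOTHETICAL raw scores for one respondent.
example = {"F": 6, "K": 10, "Hs": 8, "Hy": 22, "Pd": 18, "Pt": 14, "Sc": 12, "Ma": 20}
scores = derived_scores(example)
for name, value in scores.items():
    print(name, value)
```

Because each derived variable is a linear combination of scales already in the battery, it necessarily shares variance with its components, which bears on the question of common items raised in the text.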


Thurstone centroid factor analysis was performed
by electronic computer. Rotation to oblique simple
structure was also performed entirely by electronic
digital computer, using a standard program (Dickman,
1959; Pinzka & Saunders, 1954) so that no
human judgment entered into the selection of the
angles of rotation. Both reference vector loadings and
primary factor loadings were obtained.

Intercorrelations and factor analyses were performed
separately on the group of 50 men (hereinafter
called Group M), the group of 50 women
(called Group W), and the combined group (Group
C).

A question may be raised about the use of scales
with common items. The standard MMPI scales have
many items common to three or more scales, and the
subscales used here compound this practice. The
entire PaS scale of Wiener, for instance, is included
in the Pa scale. Welsh (1956) has considered this
problem serious enough to eliminate all common
items before calculation. Any correlation, however,
may be considered as the result of hypothetical common
elements (McNemar, 1949, pp. 117-118), and it
has been pointed out (Wheeler, 1951) that the problem
of correlations among scales is the same whether
it is due to physically identical items or to common
meanings in different items.


RESULTS AND DISCUSSION

Centroid Factors


Using the criterion of continuing to extract
factors as long as the latent root exceeds 1.0,
13 factors were found in Group M, 12 in
Group W, and 12 in Group C. The 12 to 13
factors extracted accounted for 86.4% of the
variance in Group M, 85.3% in Group W, and
82.0% in Group C.

The Thurstone centroid factor analysis produced,
in all three groups, a first factor most
heavily loaded on uncorrected Pt (reference
vector loadings .90, .95, .94, dropping to .75,
[...] with K correction), Welsh’s A
(.75, .89, .81), and Taylor’s MA (.85, .89,
.89); and with substantial negative loadings
on Gough’s Ie (-.76, -.85, -.85) and Barron’s
Es (-.78, -.85, -.85) (Table A).2
This factor accounted for 28.7% of the var- 
iance in Group M, 36.6% of the variance in 
Group W, and 33.3% of the variance in 
Group C. 
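
The extraction rule and the percentages of variance reported above can be related by a small sketch. It assumes standardized variables, so that total variance equals the number of variables; on that assumption a first factor accounting for 33.3% of the variance of 59 variables corresponds to a latent root of about 19.6. The list of roots below is HYPOTHETICAL, and the function name is chosen here for illustration.

```python
def retained_factors(latent_roots, n_variables, cutoff=1.0):
    """Keep leading factors whose latent root exceeds `cutoff`.

    Returns the retained roots and each one's percentage of total variance,
    taking total variance as `n_variables` (standardized variables assumed).
    """
    kept = []
    for root in latent_roots:  # assumed sorted from largest to smallest
        if root <= cutoff:
            break  # stop extracting once a root no longer exceeds the cutoff
        kept.append(root)
    shares = [100.0 * root / n_variables for root in kept]
    return kept, shares


# HYPOTHETICAL latent roots for a 59-variable battery.
roots = [19.6, 6.4, 3.1, 2.2, 1.4, 1.1, 0.9, 0.7]
kept, shares = retained_factors(roots, n_variables=59)
total = sum(shares)
print(len(kept), [round(s, 1) for s in shares], round(total, 1))
```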

In all three groups, the second factor was 
most heavily loaded in the positive direction 
with Welsh’s R (.45, .60, .68), Harris’ Ap 
(.40, .47, .53), Gough’s Re (.53, .32, .62), 
Drake’s Si (.53, .33, .38), and experimental 
scales for embarrassment and submissiveness; 
and loaded negatively with Ma (—.66, —.51, 
—.61) and with experimental scales for stub- 
bornness and rebellion (Table B). The first 
two factors together accounted for 40.5% of 
the variance in Group M, 45.6% of the var- 
iance in Group W, and 44.2% of the variance 
in Group C. 

The first two factors obtained by the Thur- 
stone centroid method correspond closely with 
those previously obtained by Welsh (1956) 
using a similar procedure. Welsh’s scales A 
and R (1956) were designed to measure these 
dimensions. 

Thurstone centroid factor analysis defines 
dimensions with the maximum common var- 
iance, i.e., in which the greatest number of 
measures will vary together, or in agreement 
with each other. That procedure seems de- 
signed to maximize the influence of “response 
set.” Examination of the evidence suggests 
that this is indeed the case, and that Welsh’s 
(and the present) first two centroid factors, A 
and R, are very largely composed of response
sets.

One much studied response set is the tendency
to present oneself as, on the one hand,
psychologically healthy and normal, or, on the 
other hand, as emotionally upset and in need 
of help. This aspect has been described by the 
Minnesota group as “faking good” or “faking 
bad,” by Gough (1950) as “dissimulation,” 
by Edwards (1957) as “social desirability,” 
and by DeSoto and Kuethe (1959) as “the 
set to claim undesirable symptoms.” This falsifying
tendency, whether conscious or not, is
believed to influence questionnaire test scores
seriously, the more so in tests with much face
validity or obviousness. Repeated efforts have
been made to eliminate this source of variance
from clinical tests, either at the source (Edwards)
or by suppressor variables (the earliest
of such efforts being Meehl’s K correction;
Meehl & Hathaway, 1946).


2 Tables A through M are included in the materials
available through the American Documentation Institute.

Evidence suggests that the first centroid 
factor consists largely of this response tend- 
ency, the willingness or unwillingness to admit 
psychological sickness. The evidence consists 
of the high loadings on this factor of several 
scales already known to measure this tend- 
ency. One such scale is Gough’s (1950) F- 
minus-K dissimulation index, with reference 
vector loadings of .75, .89, and .81 on this 
factor. Another is uncorrected Pt, much of
whose variance DeSoto and Kuethe (1959) 
have shown to represent a “set to claim unde- 
sirable symptoms,” and which has loadings of 
.90, .95, and .94 on this factor. 

Taylor’s and Welsh’s “anxiety” scales are 
very highly correlated with uncorrected Pt 
(e.g., in Group C, .92 and .93, respectively), 
and may also be described as consisting largely 
of this response set. Both Welsh’s and Taylor’s 
scales consist of items recognizably “sick” in 
their frank verbal content—items that might 
easily be avoided by a person seeking to appear
“normal.” With these considerations, it
is easy to understand the failure of Taylor’s 
scale (Kendall, 1954; Lauterbach, 1958) to 
be validated as a measure of anxiety. 

It is noted that K correction, which in- 
creases the validity of five clinical scales 
(Meehl & Hathaway, 1946), invariably decreased
their loadings on this factor. If a
factor measures genuine pathology, the sup- 
pressor variable K should increase factor load- 
ings. 

The second centroid factor showed no con- 
sistent relationship to measures of dissimula- 
tion. This is consistent with Wiggins and 
Rumrill’s (1959) finding of lack of relation- 
ship between dissimulation and Welsh’s R 
scale. 

A second response set that has been widely 
studied is the tendency to answer “true” or
“false.” (A good review of the literature on
response sets has been done by Wiggins, 1959,
and Wiggins & Rumrill, 1959, and will not
be repeated here.) There is some evidence
to suggest that the second centroid factor
consists largely of this “response bias.”

Welsh derived his R scale (1956) to measure
his second centroid factor. The present
study replicates Welsh’s in finding substantial
loadings (.45, .60, .68) of the second centroid
factor on the R scale. All 40 items of the R
scale are keyed false.

Couch and Keniston (1960) investigated 
the True or False response tendency in detail, 
and described the personality characteristics 
of “yeasayers” and “naysayers” in terms 
strikingly like those used by clinicians for low
and high R scorers. Couch’s “overall agreement
score,” however, gives evidence of representing
not only yeasaying but also the other
type of response set; its correlations with
MMPI variables and with 16 PF resemble the
second order factor that Cattell calls “anxiety”
(perhaps better named “willingness to
admit sickness”) and Welsh’s and the present
first centroid factor. This is not surprising,
since in most questionnaires the Yes answers
tend to be the sick sounding ones. The present
study may have succeeded in separating
these tendencies only in an orthogonal factor
analysis.

As a check on yeasaying and naysaying, the
loadings of the scales on the second centroid
factor, for Group C, were correlated with the
percentage of true keyed items in the scale.
Single-item variables as well as Mf (spurious
for mixed-sex groups because of scoring differences)
were eliminated. The result is only suggestive,
since scales contained common items,
but the correlation of -.60 is at least consistent
with the view that the second centroid
factor is related to a Naysaying response bias.
(The first centroid correlated .37 with “percentage
true,” no doubt a result of the above
mentioned tendency of True answers to be
sick sounding.)
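
The check just described amounts to a product-moment correlation, across scales, between each scale's loading on the second centroid factor and its percentage of items keyed True. A minimal sketch follows. Only the first entry is taken from the text (the R scale, loading .68 in Group C, all items keyed false); every other entry is HYPOTHETICAL and serves only to show the form of the computation, not to reproduce the reported -.60.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# (loading on second centroid factor, percentage of items keyed True)
scales = {
    "R": (0.68, 0.0),             # from the text: all 40 items keyed false
    "hypo_1": (0.40, 30.0),       # HYPOTHETICAL
    "hypo_2": (0.10, 50.0),       # HYPOTHETICAL
    "hypo_3": (-0.20, 55.0),      # HYPOTHETICAL
    "hypo_4": (-0.60, 80.0),      # HYPOTHETICAL
}

loadings = [pair[0] for pair in scales.values()]
percent_true = [pair[1] for pair in scales.values()]
r = pearson_r(loadings, percent_true)
print(round(r, 2))
```

A negative coefficient of this kind is what a naysaying interpretation predicts: the more a scale is keyed False, the higher it loads on the factor.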

To identify the first two centroid factors
(and Welsh’s A and R scales) as largely response
tendency is not to discard them as
trivial. The clinician values a measure of the
first factor, willingness or unwillingness to admit
psychological sickness, as a sign of a patient’s
motivation for treatment. And Couch
and Keniston (1960) have shown personality
correlates (id and impulsiveness versus superego
and inhibition) for the second response
set.

Obliquely Rotated Factors, Cross-Validated


General. Both of the response set factors
disappeared on rotation to oblique simple
structure, and a quite different set of factors
appeared. Indeed, some of the scales most
strongly loaded with the first centroid factor,
[...] Welsh’s and Taylor’s anxiety and
[...] scales, were remarkable for showing little
or no loading on any of the oblique factors.

This result is to be expected. A response set,
almost by definition, influences many scales.
The centroid method, as pointed out above, is
likely to maximize the influence of response
sets. Rotation to oblique simple structure,
however, is designed otherwise. The theory
behind the procedure assumes that any one
scale should be related to only a very few fundamental
dimensions, and hence this method
seeks factors each having near-naught loadings
on as many scales as possible. A vector
such as the first centroid, which has substantial
loadings on most of the scales
(Table A), is avoided. It should be [...]
for this [...] to define a factor substantially
loaded with a widespread response
set. Cattell (1952) has given reasons for
thinking that rotation to oblique simple structure
tosses the superficialities away and discovers
the fundamental underlying dimensions.3
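
The simple-structure idea described above, that a good factor has near-zero loadings on as many scales as possible, can be made concrete by counting the loadings that fall within a narrow band around zero. The sketch below is illustrative only: the band of ±.10 is a conventional choice assumed here, not one stated in the text, and the second loading vector is HYPOTHETICAL. The first vector uses the Group M loadings quoted earlier for the first centroid factor (Pt, A, MA, Ie, Es).

```python
def hyperplane_count(loadings, band=0.10):
    """Number of loadings lying within +/- `band` of zero."""
    return sum(1 for value in loadings if abs(value) <= band)


# First centroid factor, first value of each triple quoted in the text
# (Pt, A, MA, Ie, Es): no scale comes near zero.
general_like = [0.90, 0.75, 0.85, -0.76, -0.78]

# HYPOTHETICAL rotated factor: two salient loadings, the rest near zero.
simple_like = [0.70, 0.05, -0.03, 0.08, 0.61, 0.00, -0.07]

print(hyperplane_count(general_like), hyperplane_count(simple_like))
```

A vector like the first would be rejected by a rotation that seeks many near-zero loadings, which is the sense in which the response set factors could be expected to disappear.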


The rotation was performed exclusively by
electronic digital computer in order to avoid
the possibility that the investigator’s preconceived
notions of personality dimensions
might influence the type of factors found.

Examination of the 12 to 13 factors extracted
from each group showed that 5 or
possibly 6 factors were common to all three
groups. These are shown in Tables C, D, E,
F, G, and H. The factors that were found in
both Groups M and W (and also in the combined
Group C) can be regarded as confirmed
by cross-validation. The factors found in only
one group may be regarded as having failed in
cross-validation, though some of them may be
genuine factors peculiar to one sex. Only a
further cross-validation on a new sample can
decide this question.


3 It should be noted that this procedure does not
eliminate [...] from the items themselves.
[...], using an oblique factor as
[...], might produce a scale biased with [...]
unless special precautions are taken. Such is apparently
the case with Cattell’s 16 PF, in which
[...] are correlated both with F and also
(reverse direction) with K, evidence of dissimulation
(Karson & Pool, 1957).

Factor 1: Anal compulsive character or re- 
action formation. One factor corresponded 
closely with Freud’s 1908 description of the 
anal character (Freud, 1950). It contained 
substantial loadings on three experimental 
scales designed to measure Freud’s “anal 
triad”: orderliness (.61, .61, .61), stinginess
(.34, .23, .41), and stubbornness (.40, .21,
.30), respectively (Table C). These three
scales contained no common items. Higher 
than any of these were the loadings on 
Gough’s (1957) rigidity-flexibility scale (.70, 
.74, .70). A still higher loading (.73, .71, .78) 
was on an experimental scale designed to 
measure anal character, containing the four 
scales just mentioned. 

An abbreviated form of the Berkeley au- 
thoritarian F Scale (AuF) also showed posi- 
tive loadings on this factor (.32, .15, .24). 

This factor seems clearly to correspond to 
the personality type known commonly as the 
anal or compulsive character. In addition to 
Freud’s original paper in 1908, this type has 
been described by Jones (1913) and by 
Fenichel (1945). The concept of compulsive 
character has had widespread clinical accept- 
ance and is recognized as an official diagnosis 
by the American Medical Association and the 
American Psychiatric Association (1952). 

Despite clinical acceptance, there has 
hitherto been scant experimental evidence to 
support the concept of anal or compulsive 
character. In the one outstanding positive 
study, Sears (1942), dealing with ratings of 
fraternity brothers by each other, found 
(when halo effect was partialed out) positive 
correlations of .36-.39 among ratings of 
orderliness, stinginess, and stubbornness. His
finding was all the more remarkable because
the young men regarded orderliness as desirable
and stinginess and stubbornness as undesirable.

The present finding of a strong relationship
between the anal personality factor and
Gough’s rigidity scale is in accord with the
common clinical impression of the compulsive
person as “rigid” and suggests that the 
relationship between this factor and the ex- 
perimental findings on perceptual rigidity 
(Luchins, 1942; Rokeach, 1948) should be 
explored. Since the experimental scales Or, 
Sti, and Stb have not been separately vali- 
dated, Gough’s rigidity scale must be regarded 
as the chief defining scale for this factor. It 
is noted that the rigidity scale has higher 
loadings than orderliness, stinginess, and stub- 
bornness, and it is suggested that rigidity is 
the most central characteristic of the anal or 
compulsive factor. 

Anal character was the factor that ac- 
counted for the largest portion of the total 
variance. The remaining factors are pre- 
sented in arbitrary order. Wherever neces- 
sary, a factor from one of the patient groups 
has been reversed in sign, in all its loadings, 
to make it comparable with a factor derived 
from another patient group. 
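
The sign reversal mentioned above, and the matching of a factor across groups, can be sketched as follows. The text does not say how similarity between groups was judged; Tucker's coefficient of congruence is used here purely as an assumed illustration. The loadings are those quoted for Factor 1 (Or, Sti, Stb, Rig, An, AuF), reading the first two values of each triple as Groups M and W, which is itself an assumption about the order of the triples. The reflected vector is HYPOTHETICAL.

```python
from math import sqrt


def congruence(a, b):
    """Tucker's coefficient of congruence between two loading vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den


def align_sign(reference, other):
    """Reverse the sign of every loading in `other` if it opposes `reference`."""
    if congruence(reference, other) < 0:
        return [-v for v in other]
    return list(other)


# Factor 1 loadings quoted in the text, in the order Or, Sti, Stb, Rig, An, AuF.
group_m = [0.61, 0.34, 0.40, 0.70, 0.73, 0.32]
group_w = [0.61, 0.23, 0.21, 0.74, 0.71, 0.15]

# Suppose (HYPOTHETICALLY) the Group W factor had emerged from rotation reflected.
group_w_reflected = [-v for v in group_w]

aligned = align_sign(group_m, group_w_reflected)
phi = congruence(group_m, aligned)
print(aligned, round(phi, 2))
```

Reflecting all loadings of a factor leaves its interpretation unchanged, since the direction of a factor is arbitrary; only the pattern of loadings matters for comparing groups.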

Factor 2: Hysterical character or repression. 
Another factor common to the three samples 
was one whose highest loading (among previ- 
ously published scales) was on Wiener’s subtle 
Hy scale or HyS (.58, .49, .58). It also had 
smaller positive loadings on Hy (.39, .16, .28) 
and K (.15, .13, .26), and negative loadings 
on F (—.18, —.33, —.22) (Table D). The 
highest loading (.69, .78, .70) was on Rep, an 
experimental scale designed to measure repres- 
sion or hysterical character, and containing 
all of the HyS items with others. 

This factor also seems to correspond to a 
character type long accepted in psychoanal- 
ysis (Fenichel, 1945). Rosenzweig (1945) 
described an “impunitive” reaction character- 
istic of hysterical persons, and a projective 
test for measuring it. McKinley and Hatha- 
way (1944) found that persons with diag- 
nosed conversion hysteria (physical symptoms 
such as headache and vomiting) answered 
certain questionnaire items in a naive Polly- 
anna-like manner. Wiener (1948) separated 
these items (HyS) from the full Hy scale. 
Eriksen, in a series of studies, related psychi- 
atric diagnoses of hysteria, an MMPI measure 
(Hy minus Pt with K correction) and a per- 
ceptual measure of repression (Eriksen, 1952, 
1954; Lazarus, Eriksen, & Fonda, 1951). 

While hysterical conversion reaction is an
official medical diagnosis (American Psychiatric
Association, 1952), hysterical personality
has never attained that status, probably be- 
cause it is so common and is not considered 
sick enough. 

This was the only oblique factor in which
a dissimulation measure, F-minus-K, had a
substantial loading (-.21, -.28, -.30), and
it was in the paradoxical direction, such that
faking good would appear to make one’s score
sicker. Other measures of dissimulation (A,
Taylor MA, Pt) were unrelated to the factor.
Consideration of the theory of repression suggests
that it is genuine repression that is
producing the appearance of dissimulation,
rather than dissimulation contaminating the
definition of the repression factor.

R. B. Cattell, who has kindly examined the
present study, suggests (personal communication,
1960) that the hysterical character or
repression factor found here may be the same
as Cattell’s Factor I, called “premsia.” Factor
II of Lingoes (1960) also resembles the
present factor of repression or hysterical
personality.

Each of the factors discussed so far,
compulsive character and hysterical character,
has a heavy loading on a previously published
scale (Gough’s rigidity and Wiener’s HyS).
In each case, there is an experimental scale
(An and Rep, respectively) which includes
the previously published scale along with
other items. The An and Rep scales are
attempts of the investigator to improve the
measurement of the compulsive and hysterical
personalities, and their higher factor loadings
are evidence of a moderate degree of success
in this endeavor.

One reason why Rep is a better measure of
hysterical personality than HyS may be that
it contains not only the HyS items of
repression and denial, but also items designed to
tap histrionic dramatization. Another reason
may be its greater freedom from response sets.
The HyS items are almost all keyed False
and, in the keyed direction, are popular
responses that deny sickness and
appear socially desirable (Wiener, 1948). The
Rep scale is somewhat better balanced for
both kinds of response set as well as for item
frequency, and this may be why it is a purer
measure of the hysterical character factor.

Factor 3: Paranoid character or projection.
A third common factor was one whose highest
loading was on Wiener’s subtle Pa or PaS
scale (.85, .71, .76) (Table E). The whole
Pa scale showed lower (though still substantial)
loadings (.52, .34, .43) while Wiener’s
obvious paranoid subscale was unrelated to
the factor (.06, -.10, .05). The designation
“Paranoid character” was applied to this
dimension. This factor resembles Factor IV of
Lingoes (1960). Paranoid character, without
psychosis, is an accepted clinical concept and
is an official medical diagnosis (American
Psychiatric Association, 1952).

Factor 4: Conversion. A fourth factor that
was extracted separately from Groups M, W,
and C had heavy loadings on Hs (with and
without K correction) (.77, .61, .59; .80, .69,
[illegible], corrected) and on Hy (.44, .72, .50), and
much lower loadings on D (.22, .20, .10)
(Table F). It was clearly conversion reaction,
an accepted clinical entity, and an official
medical diagnosis (American Psychiatric
Association, 1952).

The first three character types found as
factors, the compulsive, the hysterical, and the
paranoid, have been defined by psychoanalysts
(Abraham, 1953) in terms of bodily
zones of libidinal fixation. Others, however,
([author illegible], 1946; Freud, 1936) have defined
these characters in terms of the predominant
defense mechanism of the ego. These characters
are marked, respectively, by the defenses
of reaction formation, repression, and projection.
Thus the four factors most clearly
cross-validated in the present study seem to be
equivalent to four defenses: reaction formation,
repression, projection, and conversion.

Factor 5: Oral aggression and delinquency.
A fifth cross-validated factor was marked by
high loadings on Pd (.63, .53, .58; or .67, .62,
.65 with K correction) and on the Gough-[illegible]
delinquency or socialization scale
([illegible], [illegible], .52) (Table G). It was tentatively
identified as oral aggression. It is less clearly
[illegible] than other factors in terms of
psychodynamics and motivation. Gough (1957)
described high scorers on De as “demanding,”
“[illegible],” and “exhibitionistic.” Cattell
(1957) used the same three adjectives in
describing one pole of his Factor D, a factor
that was omitted from the 16 PF for lack of
an adequate measuring scale. It seems likely
that Gough’s delinquency-socialization is
identical with Cattell’s Factor D and with the
factor found in the present study.

An experimental scale on rebellion, how- 
ever, did not correlate in the expected manner 
with this factor (.06, —.21, —.03) (Table G). 
In Group M, the delinquent pole of the 
factor is correlated .22 with stubbornness, 
while among the women it is correlated .23 
with submissiveness. Only further study on a 
new group can tell whether this is a genuine 
sex difference, or an accidental error of 
sampling. It fits in, however, with the general 
impression that women’s delinquencies tend 
to be more passive and less aggressive than 
men’s. 

The tentative suggestion is offered that 
demandingness rather than rebellion may be 
the central feature of this factor. If so, it can 
be equated with the psychoanalytic concept
of “oral aggression.” This factor is not
interpreted as socialization, conscience, or
ego-ideal because of its insignificant loadings on
Gough’s responsibility scale (-.02, -.23,
-.04).

Factor 6. A sixth factor, one of obsessive
worrying, may be common to the three groups
analyzed. Each group showed a factor
positively loaded with K corrected Pt (.34, .40,
.52), and less with uncorrected Pt (.15, .29,
.37). The higher loadings on corrected than
uncorrected Pt suggest (for reasons discussed
above) that this factor is a personality
variable rather than a response set. Groups M,
W, and C also agreed in loading Sc-minus-Pt
(-.24, -.34, -.39) and Hy-minus-PtK
(-.67, -.50, -.16) on the same factor. The
factor patterns from Groups M and W
correspond well enough with the accepted
(American Psychiatric Association, 1952) clinical
concept of obsessive-compulsive psychoneurosis.
But the factor from Group C bearing
nearest resemblance has more the flavor of
general neuroticism, with additional loadings
on Hy (.50), Hs (.33), HsK (.38), and D
(.36).

Oblique Factors Less Clearly Established 


There were several factors common to two 
of the three factor analyses. Group M and 
Group C showed a psychotic factor (Table I) 
with highest loadings on Sc-minus-Pt. It
seems plausible that the absence of this factor
in Group W was an accident of sampling.

A factor of sadism, cruelty, narrowmindedness,
and prejudice appeared in Groups W
and C (Table J). It was marked by loadings 
on the Berkeley authoritarian F Scale (ab- 
breviated form), anti-Catholic prejudice, and 
an experimental scale designed to measure 
sadism. 

Groups M and C shared a factor marked 
by high loadings on Ma (Table K). It was 
not significantly related to Welsh’s R. 

All three groups had factors related to 
masculinity-femininity (Table L), though 
there was no single variable with high load- 
ings on the factors from all three groups. The 
variable Mf, which seems to do so, is arti- 
factual in that it is scored differently for men 
and for women. Gough’s Fe scale had rather 
high loadings on the masculinity factors from 
Groups M and C and lower loadings on the 
factor derived from Group W. In Group M,
the masculine pole included inner-direction,
and the feminine pole, other-direction, perhaps
reflecting a difference between farmers
and salesmen, respectively, in this Illinois 
sample. In any event, there is no theoretical 
reason for expecting “femininity” to mean the 
same in men as in women. 

Both Group M and Group W showed 
factors whose highest loading was on the 
single item “I have had sex relations with at 
least three different people in my life.” There 
is no similarity in other respects between the 
factor obtained from the men and that from 
the women (Table M). In women, high de- 
gree of sex experience was associated with 
measures of warmth, of delinquency, and 
of accepted passivity.⁴

Three other factors appeared in Group M
alone. One was positively loaded with
optimistic dependency, warmth, and phobia, and
negatively loaded with L, Si, responsibility,
and feminine masochism. Another consisted of
(affirmatively) stubbornness, rebellion,
dominance, and inner-direction, and (negatively)
submission and L. The last comprised only
delinquency affirmatively, and femininity, R,
and stinginess in the negative direction.

4 Gough (personal communication, 1960) found, in
100 military officers, that those high on the Strong
vocational interest keys for “banker” and for
“mortician” reported early ages of first sexual intercourse,
and were described as coarse, noisy, and masculine.
It is to be predicted that this group should score low
on the factor of repression or hysterical character,
and at the “harria” pole of Cattell’s “premsia versus
harria” factor. While the evidence of the present
study is ambiguous, it is consistent with clinical
impressions that hysterical character has a complex
effect on sexual behavior, delaying age of first sex
experience, at least in males, yet facilitating later
unconsciously motivated promiscuity, “hysterical
acting out” (Table D, in women).

Group W also showed three additional 
factors. One was loaded affirmatively with 
accepted passivity, sex experience, obsession, 
submission, and feminine masochism; and
negatively with Sc-minus-Pt, sadism,
rebellion, ScK, and MaK. It may be compared
with Cattell’s Factor E (1957a). Another
comprised positive loadings on warmth, P49, 
anti-Catholic prejudice, capacity for status,
and phobia; and negative ones on L,
inner-direction, R, HyS, and responsibility. The
other had positive loadings on responsibility,
sex experience, L, Pa, intellectual efficiency,
and stinginess; and negative on phobia, femi- 
ninity, and anti-Catholic prejudice. 

Two other factors appeared in the combined
group. One was a general hostility factor
loaded with H[illegible], PaS, H[illegible], stubbornness,
P[illegible], and dominance. The other consisted of
self-confidence and ambition, with loadings in one
direction on capacity for status, intellectual
efficiency, dominance, accepted passivity,
responsibility, and warmth, and in the other
direction for R, social introversion, and
embarrassment.


Nature of the Oblique Factors 


Some of the factors have been seen to
correspond with diagnostic categories (compulsive
personality, paranoid personality).
But a basic conceptual difference must be
pointed out. Diagnostic categories are discrete
entities: a person either is or is not within
the category, and the usual practice is to give
a patient not more than one psychiatric diagnosis.

Factorial dimensions, on the other hand,
are continuous. The question is not whether a
person is or is not a paranoid character, but
rather, how much paranoid character does he
have? Further, a person may be high on two
or more dimensions. A person high in both
reaction formation and projection may, with
equal justification, be called a compulsive
personality or a paranoid one. Stagner and
Moffitt (1956) have shown that persons high
on a given personality trait do not show more
than chance resemblance in other traits.
These considerations make it doubtful that
psychiatric diagnoses are comparable to
medical ones. Choice of diagnostic label seems
not to be a fruitful point of dispute.

In almost every instance, K corrected scales
appeared to be factorially purer than the same
scales without K. This finding reinforces
the recommendation of [author illegible] (1956) that K
corrected scores be used in factor analysis. The
increased factorial purity of K corrected scales
appeared only after rotation to simple
structure, and was not found in the centroid
factors before rotation. Since K corrected
scales are more valid (Meehl & Hathaway,
1946), this may be evidence that rotation to
oblique simple structure produces factors
more genuine than the unrotated ones.

The extraction of factors that have a clear
clinical and psychodynamic meaning is a
satisfying result. It provides some rapprochement
between psychoanalytic and experimental
approaches. Even more satisfying is the fact
that several of the factors correspond to
defenses of the ego. For of all psychoanalytic
concepts, defenses are most amenable to
objective study, far more so than libidinal
fixation, for example. Defenses have effects in
behavior and in perception, effects that
can be studied through experiments on
observable behavior (and to some extent have
been, e.g., Eriksen, 1952, 1954; Mowrer,
[citation illegible]). Measurements of defenses by
questionnaire scales derived from factorial
studies could be useful in such behavioral
studies. The clinical usefulness of such
measurement also appears promising.

SUMMARY

A Thurstone centroid factor analysis was
done on [number illegible] variables, including the
original and newer MMPI scales, on a group of
50 men, a group of 50 women, and the combined
group of 100. Twelve or 13 factors were
found in each group. The first 2 factors in all
three groups resembled those previously
found by Welsh and named A and R. These
factors were related to response sets,
[illegible] willingness to admit sickness and
[illegible].

After rotation to oblique simple structure,
a different set of factors appeared.
About six of these were replicated, being
virtually identical in the three samples. Four
of these corresponded to ego defenses of
reaction formation, repression, projection, and
conversion. Alternatively, the first three could
be named anal or compulsive character,
hysterical character, and paranoid character. The
highest loadings on these factors, among
previously published scales, were, respectively,
Gough’s rigidity-flexibility, Wiener’s
Hy-subtle, and Wiener’s Pa-subtle. New
experimental scales had higher loadings on some
factors.

The anal character factor also confirmed 
Freud’s concept, having substantial loadings 
on three experimental scales for orderliness, 
stinginess, and stubbornness. 

A fifth common factor, loaded with Pd and
with Gough’s delinquency-socialization, was
tentatively identified with oral aggression. A
sixth factor was marked by Pt, and named
obsessive worrying, but in the combined
sample bore more resemblance to a factor of
general neuroticism.


PLAY THERAPY LIMITS AND THEORETICAL
ORIENTATION

HAIM G. GINOTT ¹ AND DELL LEBO ²

Child Guidance Clinic, Jacksonville, Florida

[Illegible] therapists agree that reasonable and
[illegible] limits are necessary in play therapy
with disturbed children. However, there are
differences of opinion about the specific limits
required in therapy. Some therapists (see
Dorfman, 1951, p. 262) only set limits on
activities that interfere with their ability to
remain [illegible] of the child. They may allow
a child to take toys home, break equipment,
urinate on the floor, leave the playroom at
will, or terminate treatment. Other therapists
(Ginott, [year illegible]) set limits on such activities.
They only allow children to express themselves
symbolically through words, play, and
[illegible] gestures.

Judging from the literature, psychoanalytic
therapists are less preoccupied with the
problem of limits than are nondirective
therapists. Extensive discussion of limits can be
found in many nondirective publications
(Axline, [year illegible]; Bixler, 1949; Dorfman, 1951;
Moustakas, 1953); only passing references to
this subject appear in psychoanalytic writings
(Schiff[illegible], 1952; Slavson, 1952). On this basis,
one might assume that psychoanalytic
therapists set fewer limits than nondirective
therapists.

We know of no published study of
the limits used by therapists of different
orientations. The purpose of the present
investigation was to discover whether limit setting is
related to theoretical orientation. The
experimental hypotheses were: (a) Therapists of
different schools will not differ in the number
of limits that they employ in play therapy.
(b) Therapists of different schools will not
differ in the kind of limits that they employ
in play therapy.

1 Now at New York University.

2 Thanks are expressed to Georgia Dreger and
[name illegible] for their assistance in compiling and
[illegible] the array of data, and to Arthur
[surname illegible] for helpful advice.


METHOD 


A questionnaire containing 54 discrete limits 3 was 
sent to 425 child guidance clinics and other agencies 
that treat children. The respondents were asked to 
identify themselves as being primarily nondirective, 
psychoanalytical, or “other,” and to indicate (Yes, 
No, or Sometimes) whether they used a particular 
limit with children who were neither psychotic nor 
organic and between the ages of 3-10 years. 

Questionnaires were returned by 227 play thera- 
pists; of these 100 considered themselves to be 
psychoanalytic, 41 nondirective, and 86 of “other” 
schools. The number of Yes, No, and Sometimes 
responses to each of the 54 limits was totaled for 
each therapeutic school. 


RESULTS 


The mean number of limits used “ordi- 
narily” and “sometimes” by therapists of the 
three schools are reported in Table 1. These 
means do not differ significantly; hence the 
results support the hypothesis that therapists 
of different orientations do not differ in the 
number of limits that they employ in play 
therapy. 


3 Copies of the questionnaire may be obtained from
either of the writers.

4 Most of the respondents were psychologists; a
few were social workers and psychiatrists.


TABLE 1

MEAN NUMBER OF LIMITS “ORDINARILY” AND “SOMETIMES”
USED BY PLAY THERAPISTS OF THREE SCHOOLS

Limit         Psychoanalytic   Nondirective   “Other”
Ordinarily          25              26           27
Sometimes           12               9           11


TABLE 2

SIGNIFICANTLY DIFFERENT PERCENTAGES OF LIMITS USED BY PLAY
THERAPISTS

                                         School
Limit                  Use    Psychoanalytic  Nondirective  “Other”     χ²
Enter playroom         No          35             54          45
                       Yes         26             15          28      11.064
                       Some        39             31          24
Pour water             No          43             54          30
                       Yes         24             24          33       9.965
                       Some        18             12          16
Paint cheap toys       No          58             78          57
                       Yes         14             15          23      19.593
                       Some        28              7          19
Paint costly toys      No          20             27          15
                       Yes         53             61          54       9.821
                       Some        23             10          23
Bring drinks, food     No          56             51          44
                       Yes          4             20          23      18.448
                       Some        40             24          31
Light matches          No          30             42          26
                       Yes         30             34          36      10.162
                       Some        28             22          38
Read books             No          54             71          48
                       Yes          6             10          17      18.321
                       Some        40             19          35
Do school work         No          35             54          22
                       Yes         25             24          33      23.668
                       Some        40             22          44
Break costly toys      No           9             19           6
                       Yes         69             71          74      10.965
                       Some        21             10          16
Hit therapist mildly   No          30             39          24
                       Yes         42             42          54       9.799
                       Some        28             19          22
Tie therapist          No          21             37          35
                       Yes         44             54          44      24.763
                       Some        34              7          19
Shoot therapist        No          16             31          15
                       Yes         69             62          67      12.004
                       Some        15              1          16
Fondle therapist       No           6              2           6
                       Yes         62             83          65      13.401
                       Some        29             12          27
Urinate or defecate    No           3             10           4
                       Yes         75             71          86      14.287
                       Some        22             19           7

Note. Because a small number of items were not answered some columns do not total 100%.

To test the second hypothesis regarding the
kinds of limits employed by different
therapeutic schools, all responses to each of the 54
limits were summed according to therapeutic
school. These were converted into percentages
and distributed into 3 x 3 chi-square
tables. Of the 54 items, 14 achieved significance
at or better than the .05 level of confidence.
(On the basis of chance alone no more
than 3 items might have achieved such
significance.) The statistically significant items
are shown in Table 2. These results do not
support the second hypothesis that therapists
of different orientations use the very same
limits in play therapy.
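
The chi-square step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `chi_square` is hypothetical, the percentages for "Enter playroom" are copied from Table 2 as printed here and treated as cell entries, and the result need not reproduce the published value exactly, since the authors' computational details are not given.

```python
from typing import Sequence, Tuple


def chi_square(table: Sequence[Sequence[float]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected cell entry if use of the limit were unrelated to school.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Rows: No, Yes, Some. Columns: psychoanalytic, nondirective, "other".
enter_playroom = [[35, 54, 45], [26, 15, 28], [39, 31, 24]]
stat, dof = chi_square(enter_playroom)

CRITICAL_05_DF4 = 9.488  # tabled chi-square value for 4 df at the .05 level
print(f"chi-square = {stat:.2f}, df = {dof}, significant = {stat > CRITICAL_05_DF4}")

# Number of the 54 items expected to reach the .05 level by chance alone.
expected_by_chance = 54 * 0.05
print(f"expected by chance: {expected_by_chance:.1f} items")
```

A 3 x 3 table has 4 degrees of freedom, so every chi-square value listed in Table 2 exceeds the .05 critical value of 9.488, and 54 x .05 = 2.7 matches the statement that no more than 3 items would be expected by chance.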


DISCUSSION

Within the confines of the present study it
is [illegible] that therapists of the three approaches
employ a similar number of limits in their
work with children. While there are differences
in the kinds of limits used by the three
approaches, a considerable body of prohibitions
is employed by all.

In the area of physical aggression
against the therapist, practitioners of all
schools concur to the same degree in prohibiting
a child from squirting water on the therapist,
painting his clothing, throwing sand, or
forcefully attacking him. They differ, however,
in that nondirectivists were significantly
more permissive in allowing a child to shoot
and to hit the therapist and significantly
less permissive in allowing a child to
tie the therapist.

In the area of physical aggression against
equipment, practitioners of all schools concur
to the same degree in prohibiting a child from
spilling sand, painting walls and furniture,
[text illegible]. Nondirectivists
were significantly more permissive
in allowing a child to pour much water into
[illegible] and break expensive toys.
“Other” therapists were the least permissive
in allowing the painting of inexpensive
[text illegible]

[Footnote, partly illegible] ... item ... whose statistical significance de[...]
Sometimes entries were omitted.

In the area of socially unacceptable behavior,
practitioners of all schools similarly
prohibited a child from smoking, using racial 
slurs, speaking or writing profanities in the 
playroom, making obscene objects, painting 
his face or clothing, undressing, and mastur- 
bating. They differed, however, in that the 
“other” therapists were significantly less per- 
missive in allowing a child to urinate or 
defecate on the floor. 

In the area of safety and health, practi- 
tioners of all schools similarly prohibited a 
child exploding a whole roll of caps, climbing 
on high window sills, drinking dirty water, 
or eating mud, chalk, or fingerpaints. They 
differed, however, in that the nondirectivists 
were significantly more permissive in allowing 
a child to light matches in the playroom. 

In the area of playroom routines, practi- 
tioners of all schools similarly prohibited a 
child from taking home toys or clay objects, 
turning off the lights, leaving or overstaying, 
bringing in a friend, and talking to passers- 
by. They differed, however, in that the non- 
directivists were significantly more permissive 
in allowing a child to decide whether or not
to enter the playroom, to read books, and to
do his school work there. The psychoanalytic 
therapists were significantly more permissive 
than members of the other two approaches in 
allowing a child to bring drinks and food into 
the playroom. 

In the area of physical affection, practi- 
tioners of all schools similarly prohibited a 
child from sitting on their laps, hugging, and 
kissing them. They differed, however, in that 
nondirectivists were significantly less permis- 
sive in allowing a child to fondle them. 

While 14 limits were used differently by the
therapists of the present sample, the fact
remains that practitioners of all schools
concurred in the use of a large number of limits.


SUMMARY 


This study aimed to discover whether limit
setting in play therapy is related to the
therapist’s theoretical orientation. Responses to a
54-item questionnaire on limits were received
from 100 psychoanalytic, 41 nondirective,
and 86 “other” play therapists.
The results indicated that therapists of varied
orientations did not differ in the number of
limits used. While some significant differences
were found in the kind of limits used, a
considerable body of limits was employed
by all.


THE CONSTRUCT VALIDITY OF THE EDWARDS
PPS HETEROSEXUALITY SCALE

E. JERRY PHARES AND CALVIN K. ADAMS

Kansas State University

The Edwards Personal Preference Schedule
(PPS) appears to be increasingly
used in both diagnosis and research. It purports
to measure 15 personality needs growing
out of H. A. Murray’s work. The outstanding
characteristics of the PPS appear to
be a forced-choice item format, an attempt to
control for social desirability in item choice,
and needs which are not obviously or
necessarily connected with clinical pathology.

At the present time, however, many of the
scales of the PPS have not been systematically
investigated. An exception would be
the Achievement scale. Several studies have
investigated the relationship between the PPS
Achievement scale and the McClelland n Ach
measure (Bendig, 1957; Himelstein, Eshenbach,
& [author illegible], 1958; McClelland, 1958;
Marlowe, 1959; Melikian, 1958). Still other studies
have investigated the construct validity of
some of the scales. Bernardin and Jessor
(1957) utilized the Autonomy and Deference
scales of the PPS as a measure of dependency
and generally confirmed the construct validity
of those scales. Gisvold (1958), using a
[illegible] measure of conformity
behavior, found a significant negative relationship
between it and the PPS Autonomy scale.
No relationship between PPS Deference and
conformity behavior was found, however.
Zuckerman and Grosz (1958) used the Sway
Test (a measure of suggestibility) and found
low swayers scored significantly higher on the
Autonomy scale than high swayers. The Deference
and Succorance scales did not differentiate
the groups. Zuckerman (1958), defining
rebelliousness on the basis of sociometric
[illegible], could find only partial support for the
hypothesis that rebellious subjects would
score high on the Autonomy, Dominance, and
Aggression scales of the PPS. In another study
of PPS n Ach, Worell (1960) demonstrated
that subjects high on this scale showed
significant superiority over low subjects in two
verbal learning situations. In a factorial
investigation of the entire PPS, Levonian,
Comrey, Levy, and Proctor (1959) found a
discrepancy between what the PPS is designed
to measure and its actual factorial
item content.

The present study is an attempt at the 
construct validation of the Heterosexuality 
scale of the PPS as regards males. This vari- 
able is described by Edwards (1954) as 


follows: 


To go out with members of the opposite sex, to 
engage in social activities with the opposite sex, to 
be in love with members of the opposite sex, to kiss 
those of the opposite sex, to be regarded as physically 
attractive by those of the opposite sex, to participate 
in discussions about sex, to read books and plays 
involving sex, to listen to or tell jokes involving sex, 
to become sexually excited (p. 5). 


Therefore, given this definition of hetero- 
sexuality how might males who score high on 
this scale behave differently from males who 
score low? Or, stated another way, will be- 
havior in experimental situations developed in 
the light of the characteristics of such a het- 
erosexuality construct relate to behavior on 
the Heterosexuality scale of the PPS? If so, 
the construct validity of the scale will have, 
in part at least, been supported. 

Selection of subjects for this study was based upon
the administration of a scale to 170 males in two
general psychology classes at Kansas State
University. The scale consisted of the 28 items of the
PPS Heterosexuality scale plus 22 buffer items
selected at random from other PPS scales.¹ From
this procedure a high heterosexual group was drawn
composed of 20 subjects with scores between 16 and
28. A low group contained 20 subjects with PPS
scores between 1 and 8. The same two groups of
subjects performed in both phases of the study to
be reported below.

1 There is some evidence that removing items from
the context of a standardized test may alter the
nature of the items and responses to them (Edwards,
Wright, & Lunneborg, 1959).


PHASE 1 


Relationships between needs and esthetic 
preferences would appear logical ones given 
the pervasive role needs are assumed to play. 
Such a relationship between n Achievement 
and esthetic preferences has been investigated 
by Knapp (1958). For our purposes, if high 
PPS heterosexual subjects find sex and sex 
related activities more congenial and pleasur- 
able than do low PPS subjects, it seems rea- 
sonable to conclude that their esthetic pref- 
erences for photographs might reflect this. 
Therefore, it was predicted that high PPS 
heterosexual subjects would place a higher 
esthetic value on photographs involving sex- 
ual elements in varying degrees than would 
low PPS heterosexual subjects. 


Procedure 


Sixty black and white photographs were selected 
principally from various art and photography maga- 
zines, but also from other sources. On a common 
sense basis, the photographs were classified as either 
sexual or nonsexual. The former consisted of such
subjects as female nudes, female facial portraits,
“cheesecake,” boy and girl holding hands or kissing,
etc. The nonsexual photographs ranged from skylines
of New York to street beggars, animals, children,
etc. Each photograph was then examined by
three judges who independently determined its sexual
or nonsexual character. Using unanimous agreement
as a criterion eight photographs were eliminated as 
ambiguous. Next, 47 unselected males were drawn 
from general psychology classes at Kansas State 
University. In three group sessions of about 15 sub- 
jects each, the remaining 52 photographs were con- 
secutively projected onto a screen. The subjects were 
asked to rate each photograph on a five-point scale 
along a dimension of esthetic value. This enabled a 
value to be assigned to each photograph obtained 
by summing the ratings of the subjects and dividing 
by 47. 

The remainder of the pre-experimental work con- 
sisted of arranging the above photographs into seven 
groups of six photographs each. Each group of six 
contained four nonsexual and two sexual photo- 
graphs, all of approximately equal value. This pro- 
cedure eliminated 10 photographs whose values were 
either too high or too low to easily fit into any of 
the seven groupings. Following this, each group of 
six photographs was pasted onto 21” X 28” white 
cards in two rows of three each. The positions of
the sex photographs were randomly distributed
throughout the series.

In the experiment itself, subjects in both groups
were told they were participants in a study to
develop an art appreciation test and were asked
to rank the photographs in each group from one
to six in terms of how artistically pleasing they
were. Subjects were allowed a maximum of [number illegible]
minutes to examine each group of six, whereupon
they made their rankings and the next group of
six was presented. Each subject was given a score
which consisted of the sum of his rankings of the
14 sex pictures.
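
The scoring rule just described can be made concrete with a short sketch. The function name, the card representation, and the positions of the sexual photographs on each card are hypothetical and chosen only for illustration; the source states only that each of the seven cards held six photographs, two of them sexual, and that the score was the sum of the ranks given to the 14 sexual photographs.

```python
from typing import List, Sequence, Tuple

# One card = (ranks given to its six photographs, positions of the two sexual ones).
Card = Tuple[Sequence[int], Sequence[int]]


def sex_picture_score(cards: List[Card]) -> int:
    """Sum of the ranks a subject assigned to the sexual photographs."""
    total = 0
    for ranks, sexual_positions in cards:
        if sorted(ranks) != [1, 2, 3, 4, 5, 6]:
            raise ValueError("each card needs the ranks 1..6 exactly once")
        total += sum(ranks[p] for p in sexual_positions)
    return total


# Hypothetical layout: sexual photographs at positions 0 and 3 on every card.
prefers_sexual = [([1, 3, 4, 2, 5, 6], (0, 3))] * 7  # ranks them 1 and 2
avoids_sexual = [([6, 1, 2, 5, 3, 4], (0, 3))] * 7   # ranks them 6 and 5

lowest = sex_picture_score(prefers_sexual)   # 7 cards x (1 + 2)
highest = sex_picture_score(avoids_sexual)   # 7 cards x (6 + 5)
print(lowest, highest)
```

Under this rule scores can range from 21 to 77, and a lower score means the sexual photographs were ranked as more pleasing.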


Results 


The mean and standard deviation of the
high heterosexual group were 45.2 and 12.[digit illegible],
respectively, while the corresponding mean
and sigma for the low group were 57.6 and
9.1.² This mean difference of 12.4 with an SE
of the difference between means of 3.5 yielded
a t value of 3.5, significant at the .001 level.
It is interesting to note also that only [number illegible]
cases in the high group exceeded the mean
of the low group and only one case exceeded
the ninetieth percentile of the low group.

From these data it is clear that subjects
high on the PPS Heterosexual scale place a
higher value on photographs involving sexual
elements than do subjects low on that scale.
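
As a check on the arithmetic, the reported t can be recomputed from the summary figures alone. This is a minimal sketch: the function name is hypothetical, and the degrees of freedom assume an ordinary independent-groups t test with 20 subjects per group, which the source does not state explicitly.

```python
def t_from_summary(mean_high: float, mean_low: float, se_diff: float) -> float:
    """t = (difference between the two group means) / (SE of that difference)."""
    return (mean_low - mean_high) / se_diff


N_PER_GROUP = 20  # high and low heterosexual groups, as described in the text

t_phase1 = t_from_summary(mean_high=45.2, mean_low=57.6, se_diff=3.5)
df_phase1 = 2 * N_PER_GROUP - 2  # assumption: independent-groups t test

print(f"t = {t_phase1:.2f}, df = {df_phase1}")
```

The quotient 12.4 / 3.5 comes to about 3.54, which agrees with the rounded value of 3.5 given in the text.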


PHASE 2 


In this study it was hypothesized that subjects
high on the PPS Heterosexuality scale
would show greater retention of material
encouraging the importance of sexual [illegible],
dating, and sex education in the prevention
of mental illness than would low PPS
Heterosexuality subjects. This prediction was
based on the assumption that subjects will
better learn and retain material supportive of
their needs than material opposed to them, or
that contravaluant material should be more
disruptive of the retention process than
supportive material. This prediction would also
be consistent with research on attitudes and
their role in learning and retention ([author illegible],
1941; Levine & Murphy, 1943).


Procedure a! 


P d Ot 
As a warmup, subjects were given a brief = ce 
digit span test. Subjects were then read t“ jot 
Passages. Following each, they were given me 
to write down as much of it as they could 7°?" 


ic 
? The higher the score, the lower the esthett 


p 


7 a Validity of EPPS Hetcrosexuality Scale 


The first or neutral passage was Memory Selection [...] from the
Wechsler Memory Scale (1945). The second selection was the
sexual passage, which read:


[...] Rogers/ the eminent/ New/ London/ [...]/ recently/ spoke/
on the importance/ of sex education/ and dating/ in the
prevention/ of mental illness./ He urged/ males/ to learn/ [...]
more about sex/ and sexual functions./ He [...] the importance/
of active dating/ and [...] with girls/ in promoting/
happiness./ The speech was/ very/ well received/ and increased/
his stature/ as an authority/ on sex./


Each subject's score was the total number of units recalled on
the sexual passage subtracted from the total number recalled on
the neutral passage. To eliminate negative numbers a constant of
10 was added to [...] scores. This procedure enabled the control
of such extraneous variables as intelligence and learning
ability which might have accounted for differential retention
in the two groups. Twelve [...]
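
The scoring rule just described can be written out as a short Python sketch. The function and parameter names are hypothetical, and the example subjects are invented for illustration; only the rule itself (neutral units minus sexual units, plus a constant of 10) comes from the text.

```python
CONSTANT = 10  # added to every score so that no score is negative


def retention_score(neutral_units: int, sexual_units: int) -> int:
    """Units recalled on the neutral passage minus units recalled on the
    sexual passage, plus the constant of 10 described in the text."""
    if neutral_units < 0 or sexual_units < 0:
        raise ValueError("unit counts cannot be negative")
    return neutral_units - sexual_units + CONSTANT


# Hypothetical subjects, not data from the study.
print(retention_score(neutral_units=12, sexual_units=14))  # recalled more sexual units
print(retention_score(neutral_units=12, sexual_units=9))   # recalled fewer sexual units
```

Because the sexual-passage count is subtracted, a higher score means poorer relative retention of the sexual passage, which is the direction the footnote to the Results describes.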


Results

The mean retention score and standard deviation of the high
heterosexual group were 10.95 and 3.8, respectively, and the
corresponding mean and sigma for the low group were 12.0 and
2.8.³ The difference between means is thus 1.05 and the SE of
the difference between means 1.06. A t test gave a [...] .906.
This result, while not significant, is in the predicted
direction.


DISCUSSION

The hypothesis of the first study was confirmed in a highly
significant manner. The second study produced nonsignificant
results in the expected direction. In view of this
consideration, a t test was run again on the retention data
after eliminating the five lowest scoring high heterosexual
subjects and the five highest scoring low subjects. In this
fashion three subjects with PPS scores of 16, two with scores
of 17, and five with scores of 8 were thrown out, thus making
extreme groups of 15 each. The number of cases eliminated was
determined arbitrarily without reference to any distribution
characteristics. Nor were there any “holes” likely to inflate a
trend. As it happened, this procedure entirely eliminated all
subjects with scores of either 16, 17, or 8; there was no need
to “choose” which subjects to include and which to exclude.
This procedure resulted in mean retention scores of 10.0 and
12.2 for the high and low groups, respectively. The t value,
based on a mean difference of 2.2 and a SE of the difference
between means of 1.15, was now 1.9, significant at the .06
level. Matching the groups pair-wise on the basis of their
scores on the neutral passage, and assuming the discrepancy
scores to be correlated, lowered the obtained p level to only
.05. These latter findings suggest either that the Edwards
Heterosexuality scale is somewhat nondiscriminating except at
the extremes or that our retention measure was not as sensitive
as it might have been. In retrospect it seems that we could
have constructed a more threatening or need-engaging sex
paragraph or perhaps a measure more easily capable of being
scored for distortion.

3 The higher the score, the lower the retention of [...]
material.

As a test of whether the two criterion meas-
ures were related within subjects, subjects’
scores on the retention task were correlated
with their scores on the photograph task. The
correlation was only .155. Again, however,
eliminating the aforementioned 10 subjects 
raised the r to .29. This coefficient is not sig- 
nificant, but in view of the increase may sug- 
gest a modicum of generality, especially as- 
suming a more sensitive retention measure. 
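
The statement that an r of .29 is not significant can be checked with the usual conversion of a correlation to a t value, t = r·sqrt(n − 2) / sqrt(1 − r²). The Python sketch below rests on an assumption: n = 30 is inferred from the two extreme groups of 15 described earlier, not stated by the authors for this correlation. The critical value is the two-tailed .05 point for 28 degrees of freedom from a standard t table.

```python
import math


def t_for_correlation(r: float, n: int) -> float:
    """Convert a Pearson r based on n pairs into a t value with n - 2 df."""
    if n <= 2:
        raise ValueError("need more than two pairs")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


CRITICAL_T_DF28 = 2.048  # two-tailed .05 critical value for 28 df (standard t table)

# Assumption: 30 subjects remain after the 10 eliminations (two groups of 15).
t_value = t_for_correlation(0.29, 30)
print(round(t_value, 2), t_value < CRITICAL_T_DF28)
```

Under that assumption the t value falls short of the critical value, which agrees with the authors' description of the coefficient as not significant.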

All in all, the strong evidence provided by 
the results of the first hypothesis and the 
suggestive evidence of the second hypothesis 
appear to give support to the construct valid- 
ity of the Edwards PPS Heterosexuality scale 
as regards males and underscore particularly 
its potential usefulness for research purposes. 


SUMMARY 


In this study the Edwards PPS Heterosexuality scale was
investigated with respect to its construct validity. Groups of
high and low scoring males were utilized. Two investigations,
one involving esthetic preferences and the other retention
scores, were carried out. In the first case it was found that
high PPS males placed a significantly higher esthetic value on
sexual photographs than did low subjects. In the second study
suggestive evidence was found that high subjects exhibit
better retention of sexual material than do low subjects.

Generally, the evidence appears supportive 
of the construct validity of the Edwards PPS 
Heterosexuality scale. 
A comprehensive, empirical system of personality dimensions
based on factor analyses of a battery of objective tests has
been described in previous publications (Cattell, [...],
1957a). The initial emphasis in the development of this system
has been to establish these as first-order factors, that is,
factors which are based on correlations among tests and which
therefore adhere [...] to the detailed information available in
tests. As first-order factor definition has attained increasing
replication, permitting [...] exact conceptualization and
measurement, it has become possible to shift attention to
second-order factors, that is, factors based on the
correlations between first-order [factors] (Cattell, 1956,
1957a). These [correlations] may be determined either (a) by
measuring the factors by factor batteries, in which case they
are attenuated by test unreliability and disturbed by
invalidity, or (b) in the plot of rotations to simple structure
in the first-order domain. In the latter case, there is no
attenuation effect and relatively [little] disturbance by
invalidity, if and as the rotation is good and enough variables
are employed to [define] the hyperplanes. Research [...] take
both approaches, since the [...] consensus throws light on the
magnitude of error. Both approaches are considered in this
article.

1 Opinions expressed are those of the writers and not
necessarily shared by the Department of [...].

Second-order analysis is a parsimonious [...]. There will be
fewer second-order than first-order factors, just as there are
fewer first-order factors than there are tests (Cattell, 1952).
Second-order factors may therefore be looked upon as relatively
broad descriptive categories, interpretable as representing
general organizing influences in personality. Due to the
limited number of explanatory concepts which the human mind
seems able to juggle at one time, as well as to the analytic
defects of behavioral analysis at the level of premetric
general and clinical observation, the concepts of most
psychologists fit factors at the second-order rather than the
first-order level. For example, the personality concept
measured by the questionnaire second-order extraversion factor
in the 16 PF (Cattell, 1956; Cattell, Saunders, & Stice, 1957)
is probably referred to more often than are concepts measured
by the first-order components: Cyclothymia, Surgency, Parmia
(Cattell, 1957a; Cattell, Saunders, & Stice, 1957). Both levels
have their use and, although secondaries (second-orders) lose
some of the predictive power of primaries (first-orders),
knowledge of the second-order structure is extremely important
for understanding personality, developmentally and in action.
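
To make the idea of a second-order factor concrete, the sketch below factors a small correlation matrix among first-order factors rather than among tests. It is an illustration under loud assumptions: the 3 x 3 matrix is hypothetical, and the method shown (the largest principal component, found by power iteration) is a simple stand-in, not the oblique simple-structure rotation the authors describe.

```python
import math


def first_component_loadings(corr, iterations=500):
    """Power iteration on a correlation matrix.

    Returns (loadings, eigenvalue) for the largest principal component,
    where each loading is sqrt(eigenvalue) times the eigenvector entry.
    """
    n = len(corr)
    vector = [1.0 / math.sqrt(n)] * n  # unit-length starting vector
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(corr[i][j] * vector[j] for j in range(n)) for i in range(n)]
        eigenvalue = math.sqrt(sum(x * x for x in product))
        vector = [x / eigenvalue for x in product]
    return [math.sqrt(eigenvalue) * x for x in vector], eigenvalue


# Hypothetical correlations among three first-order factors (not from the paper).
first_order_corr = [
    [1.0, 0.4, 0.3],
    [0.4, 1.0, 0.5],
    [0.3, 0.5, 1.0],
]
loadings, eigenvalue = first_component_loadings(first_order_corr)
print([round(x, 2) for x in loadings], round(eigenvalue, 2))
```

Each printed loading plays the role that a table entry plays later in the paper: the relation of one first-order factor to the broader second-order dimension.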
The position achieved by research to date, in respect to orders
and media of observation, is as follows:

1. The establishment of first-order person- 
ality factors in the questionnaire (verbal, 
self-evaluative) and life record (behavior 
rated in situ) media of measurement (e.g.,
Cattell, 1957a; Cattell, Saunders, & Stice,
1957).

2. The tentative establishment of first- 
order personality factors in the objective test 
(or T) realm (Cattell, 1957a; Cattell & 
Scheier, 1959; Scheier & Cattell, 1958)— 
where “objective” tests are understood as 
being tests which are relatively disguised in 
purpose, difficult to fake, and which are based 
on the subject’s performance in miniature 
situational tests rather than self-report (Cat- 
tell, 1958; Scheier, 1958). 

3. The establishment of second-order per- 
sonality factors in the questionnaire (or Q) 
medium of measurement (Cattell, 1956). 

4. The establishment of some matches 
across Q and T media. After separate factor- 
ization within each of the two media (in 1 
through 3 above), factor scores are correlated 
to seek matches in the two series. Results in 
this area are not yet definitive, but they 
strongly indicate that (a) at least four sec- 
ond-order factors in questionnaires match 
(i.e., measure the same dimensions as) first- 
order factors in objective tests (Cattell & 
Scheier, 1959, 1961; Scheier & Cattell, 1958), 
and (b) even when the entire second-order 
questionnaire factor realm is accounted for 
and matched as best one can with objec- 
tive test factors, all but five or six of the 
objective test factors lack substantial ques- 
tionnaire association—i.e., they do not have 
questionnaire factor equivalents at either the 
first- or second-order Q level. A possible in- 
ference is that many objective test factors are 
getting at areas of personality which ques- 
tionnaire factors do not, and probably never 
can, measure. 

5. First explorations have been made of the second-order
factors among the objective test medium primaries. As Paragraph
4 above indicates, these might be expected to represent broader
influences than those in questionnaire and rating primaries.
The present paper concerns itself with organizing the evidence
from these recent studies.²


2 Eventually, second-order objective test factors will have to
have their relations checked with first- and second-order Q
factors. Presumably, if the evidence of Paragraph 4a above holds
up, no direct matches will be found because second-order
objective test factors occur at a higher order than any known Q
factors, namely at a third-order, relative to [first-]order Q
factors.


METHOD


The researches available for collation are five in number, four
being based on method of analysis (b) and one on method (a) as
described in the first paragraph of this paper. All studies
operated upon individual difference patterns; i.e., they were
not incremental or P technique studies concerned with state
factors, but rather, R technique analyses concerned with trait
factors (Cattell, 1952, 1957a). For close comparability, not
confusing population differences with experimental error, the
studies considered in the present paper were all based on young
male American adults. For ease of reference in subsequent
tables and discussion, they are referred to by temporary
symbols used in other publications from the senior author's
laboratory (C5, C6, etc., below). These studies are described
in necessary detail elsewhere (Cattell, 19[...]; Cattell &
Scheier, 1959, 1961; Scheier & Cattell, 1958), but the
essential characteristics are as follows:


C5. 500 United States Air Force males measured on 128
variables. Rotation to oblique simple structure at the
first-order yielded 16 factors (Cattell, 1955) and was followed
by a second-order analysis (Method b) of the correlations among
first-orders (see Appendix 12 in Cattell, 1957a).

C6. 250 United States Air Force males measured on 64 variables.
Rotation to oblique simple structure at the first-order yielded
15 factors (Cattell, 1955b) and was followed by a second-order
analysis (Method b) of the correlations among first-orders (see
Appendix 12 in Cattell, 1957a).

R1, R2. Two studies each of which measured the same 86 male
college undergraduates, with 120 and 103 variables,
respectively (69 of which were common to both studies).
Rotation to oblique first-order simple structure, reported on
elsewhere (Cattell & Scheier, 1959; Scheier & Cattell, 1958),
yielded, respectively, 15 and 17 factors, and was followed by
second-order analysis (Method b) of the correlations among
first-order factors. These two second-order resolutions are
published here for the first time.

N1. 315 United States Navy male Submarine School candidates³
were scored on 18 first-order objective test factors as
measured by the O-A Battery (Cattell, 1955a). That is, in the
N1 study each person's score on each factor was obtained from a
combination of test scores demonstrated, by consistent and
substantial loadings on each factor, to give the best possible
estimate of that factor. Correlations were computed among these
battery scores (Method a above), and factors were then
extracted and [...] instead of the Cr matrix (Cattell, 1952)
derived [...] oblique rotations, as was the case in the other
four studies.


3 These data were collected at the United States Naval Medical
Research Laboratory, New London, Connecticut, under Bureau of
Medicine and Surgery [...] Project NM 23 02 20.


Second-Order Personality Factor Structure


The factorization in all of the five studies [...] of
completeness of factor extraction described elsewhere (Cattell,
1952) and pursued simple structure blindly to a plateau at
which hyperplane count proved unimprovable. At that point the
factor patterns were inspected and a process of matching
(correlation of loading patterns) was begun, factor [...] each
of the other four studies. This matching [showed] tolerably
good convergence of results in the five studies and the matched
factors and their [loadings] are therefore set out in Tables 1
through 7 below, each table representing a second-order
objective test factor which we now believe to be reasonably
well confirmed. All five studies replicate the same seven
second-order factors, and although a given factor is often less
clear in one series than in another, no alternative matching
would be nearly as satisfactory as that presented.
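
The matching step, "correlation of loading patterns," can be sketched as follows. The code is illustrative only: the loading values and the labels "A" and "B" are hypothetical, and a plain Pearson correlation over variables shared by two studies is assumed as the index of match; the authors do not give their exact computation here.

```python
import math


def pattern_correlation(loadings_a, loadings_b):
    """Pearson correlation between two loading patterns over the same variables."""
    if len(loadings_a) != len(loadings_b) or len(loadings_a) < 2:
        raise ValueError("patterns must cover the same two or more variables")
    n = len(loadings_a)
    mean_a = sum(loadings_a) / n
    mean_b = sum(loadings_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(loadings_a, loadings_b))
    var_a = sum((a - mean_a) ** 2 for a in loadings_a)
    var_b = sum((b - mean_b) ** 2 for b in loadings_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant pattern has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


def best_match(target, candidates):
    """Return (name, r) for the candidate pattern most correlated with target."""
    scored = ((name, pattern_correlation(target, pattern))
              for name, pattern in candidates.items())
    return max(scored, key=lambda item: item[1])


# Hypothetical loading patterns on four shared variables (not data from the paper).
target_factor = [0.47, 0.42, 0.18, -0.08]
other_study = {
    "A": [0.56, 0.13, 0.33, -0.06],
    "B": [-0.20, 0.05, 0.10, 0.45],
}
name, r = best_match(target_factor, other_study)
print(name, round(r, 2))
```

Picking the single best candidate per factor is a simplification; a full matching across five studies would also have to prevent two factors from claiming the same partner.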


The layout of these tables follows the usual conventions, as
follows:

1. Each table is headed by a contingent verbal title for [...]
future studies. Each verbal title is bipolar, the positive
(high score) pole being on top and the negative in parentheses.

2. First-order factors (the “variables” at this order) are
identified in the left-hand column of the table (a) by their
Universal Index or UI number (Cattell, 1957b), which are
constant across all published [...], and (b) by their verbal
titles.

3. The entries in Columns 3 through 7 are the [...] on the
second-order, each column being designated by the symbol for
the research in which the results were found. These values
[...] from the true values, in the case of C5, C6, R1, and R2,
by reason of imperfect simple structure positions in first- and
second-order relations, and in the case of N1 by attenuation
due to unreliability of factor batteries. Inconsistencies of
sign between one study and others are indicated by enclosing the
atypical value in brackets. A positive sign on a load- 
ing means that the positive pole of that factor, as it 
is standardly written in various textbooks (Cattell, 
1957a; Cattell & Scheier, 1961) goes positively with 
the second-order factor pole as written, and oppo- 
sitely for a minus loading. Since, in reading titles 
on the left, readers like to see at once which “go 
together,” we have followed the practice of writing 
in the pole which is consistent with the positive direc- 
tion of the second-order factor. That is to say, when 
a loading in the numerical columns is negative (for 
the positive pole of the first-order factor), we have 
already reversed the title to put the negative pole of
the factor in this title column. Therefore, the verbal
title of the first-order already gives the direction in
which (pole at which) it is related to the second- 
order factor. The signs on the loadings merely tell 
whether this is the positive (high score) or negative 
(low score) pole of the first-order, as it is usually 
scored and thought of. 

4. The second column gives the average loading, 
which, by reason of the attenuation in Ni, would be 
expected to err systematically, slightly below the true 
value. The given rank order in listing the primaries 
is not literally the declining order of their mean load- 
ing on the second-order, but an estimate of the order 
of importance of the primaries, made by taking into 
account also the consistency and frequency of replica- 
tion of the result. 

5. The tables report data only on the first-orders
which are most highly and consistently associated
with second-orders. The full tables, showing the load-
ing of all first-orders present in each study, including
those with lesser to essentially zero relationship, are
preserved in tables available from the American
Documentation Institute.⁴
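
The table conventions above (an average loading per first-order factor, with atypical signs set in brackets) can be illustrated with a short Python sketch. Two points are assumptions rather than statements by the authors: that the average is taken over only those studies in which the factor appears, which is what the printed averages seem to imply, and that an "atypical" value is one whose sign opposes the average. The function names are ours. The first row uses the legible UI 19 entries of Table 2, in hundredths; the second row is hypothetical.

```python
def average_loading(loadings):
    """Mean of the available loadings; None marks a study lacking the factor."""
    present = [value for value in loadings if value is not None]
    if not present:
        raise ValueError("no loadings available")
    return sum(present) / len(present)


def atypical_signs(loadings):
    """Positions of loadings whose sign is opposite to that of the average."""
    average = average_loading(loadings)
    return [index for index, value in enumerate(loadings)
            if value is not None and value * average < 0]


# UI 19, Promethean Will, as printed in Table 2 (hundredths, five studies).
promethean_will = [18, 33, 21, 6, 37]
print(average_loading(promethean_will))

# Hypothetical row with one missing study and one sign reversal.
hypothetical_row = [45, 2, -7, None, 30]
print(average_loading(hypothetical_row), atypical_signs(hypothetical_row))
```

The first printed mean can be compared with the +23 given as the Average for that row in Table 2.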


4 Complete loading tables for all first-order factors have been
deposited with the American Documentation Institute. Order
Document No. 6758 from the ADI Auxiliary Publications Project,
Photoduplication Service, Library of Congress, Washington 25,
D.C., remitting in advance $1.25 for microfilm or $1.25 for
photocopies. Make checks payable to: Chief, Photoduplication
Service, Library of Congress.


TABLE 1


F(T)I: TIED SOCIALIZATION OR SUPEREGO (vs. ABSENCE OF CULTURAL INTROJECTION)


Loadings in Studies


First-Order Factor Average C5 C6 R1 R2 N1
Comention, Conformity to 62 +46 +03 434 
Ula << ultural Standards ` +36 Ge 13 +22 +05 
z 19 updo neral Intelligence E 2 <37 —09 —06 —20 
as Stbduednes —20 - 52 ae oF 
UI 35 go tate Rei —33 —04 +4 2 3 
Ur 32 elf-Reliance +44 38 —01 =3i =<t6 
“Xtraversion —19 =i = gi 
TABLE 2


F(T)II: EXPANSIVE EGO (vs. HISTORY OF DIFFICULTY IN PROBLEM SOLVING)


Loadings in Studies


First-Order Factor Average C5 C6 R1 R2 N1
UI 16 Harric Assertiveness +34 +47 +56 +32 +36 00
UI 1 High General Intelligence +28 +42 +13
UI 19 Promethean Will +23 +18 +33 +21 +06 +37
UI 23 Ergic Regression -21 -08 -06 -48
UI 36 Self-Sentiment Development +29 +29
UI 18 Naivete -15 [...] -25 -03 [...]

DATA AND INTERPRETATION FOR SECOND-ORDER FACTORS

Factor F(T)I, Tied Socialization, has claims to being the
largest second-order factor. There is high internal
psychological consistency (with the possible exception of low
intelligence) in terms of the characteristics of the
first-order factors involved. A clinician might well claim it
to be proof of a broad superego organization, and some solid
support for this lies in the finding (Cattell & Scheier, 1961)
that neurotics are significantly higher than normals on this
second-order factor. However, in the label “Tied Socialization”
we have favored an hypothesis which may specialize to a
superego definition, but which is at present broader. It
implies that this pattern represents the extent to which the
individual has accepted the culture patterns and standards of
the group. It has overtones of extraversion and realistic
contact, subduedness, receptivity, and lower intelligence
(uncritical?), but the central factors are UI 20, Comention
(conventional acceptance of values) and UI 28, Rigidity of
Superego, which we believe the contingent title aptly
designates.

Except for UI 23(-), there is again considerable psychological
consistency in F(T)II, here in the sense of willfulness (“will
power”) and a high development of the self-sentiment. Once
again, too, the clinician might argue that this particular
clustering of first-order factors gives support to an
observational concept at the second-order level, in this case
the concept of a broad dynamic organization that is called “ego
development” or “ego strength.” The essential unifying concept,
covering higher intelligence and the associated
characteristics, appears to be, if an environmental explanation
is adopted, a history of success in managing ergic (drive)
satisfactions. This interpretation would also fit
psychoanalytic concepts of ego strength.


TABLE 3


F(T)III: TEMPERAMENTAL ARDOR (vs. TEMPERAMENTAL APATHY)


Loadings in Studies


First-Order Factor Average C5 C6 R1 R2 N1
UI 21 Exuberance, Energetic Spontaneity +31 +70 +40 +23 [...]
UI 20 Comention, Conformity to Cultural Standards +27 +48 +2[...] +08 [...]
UI 1 Low General Intelligence -28 -15 [...]
UI 19 Promethean Will +21 +21 [...] +18 +54 [...]
UI 27 Alert Control [...]
UI 32 Extraversion [...]
[...] (10) -55
TABLE 4


F(T)IV: HIGH EDUCATED SELF-CONSCIOUSNESS (vs. LOW EDUCATED SELF-CONSCIOUSNESS)


Loadings in Studies


First-Order Factor Average C5 C6 R1 R2 N1
UI 22 Corticalertia, Cortical Alertness +31 +64 +40 +08 (-02) +47
UI 18 Shrewdness +28 +66 +24 +40 +07 +01
UI 25 Imaginative Tension +17 +45 [...] +02 [...]
UI 36 Self-Sentiment Development +51 +51
UI 30 High General Reactivity -16 -03 -08 (+05) [...]
UI 29 Low Adaptation Energy -15 -01 -08 -53 (+04)
UI 33 Dourness +20 +20


Criterion evidence on this factor shows [it] distinguishing
even more powerfully than [...] between neurotics and normals,
the [...] being significantly higher than neurotics [on]
Expansive Ego (Cattell & Scheier, 1961).

[...] to pick out those factors which individually appear to
have relatively high genetic determination. Previous
multivariate experiments (Cattell, Stice, & Kristy, 1957) [...]
heredity and environment may make to certain of the objective
test factors. In particular it appeared that heredity was the
main determiner in F(T)III Factors UI 1, Intelligence, and UI
20, Comention. Hereditary and environmental influences appeared
to be about [equal] in determining Factors UI 19, Promethean
Will, and UI 27, Alert Control, which [...] on F(T)III.
Conceivably, these determinations all arise from a single
genetic influence covering all primaries con- 
tributing to this second-order factor. How- 
ever, it is difficult to imagine in polygenic 
determination how a series of genes would 
happen to coincide in repeatedly affecting 
several things in the same way. This is a pro- 
vocative finding for psychological genetics. 
Meanwhile, we shall label the second-order 
“Temperamental Ardor,” since, except for 
UI 20, the central character is a willful exu- 
berance and ardor of temper. Consistent with 
this interpretation, F(T)III has been found 
to be significantly higher for hospitalized 
neurotics of sociopathic type than it is for 
normals (Cattell & Scheier, 1961). 

F(T)IV tends to be loaded by what appear 
to be largely environmentally determined fac- 
tors. In previous investigations (Cattell, Stice, 
& Kristy, 1957) environment appeared to be 
the main determiner in F(T)IV Factors UI 
22, Corticalertia, and UI 29, Low Adaptation 
Energy-vs.-Overresponsiveness. It probably 
has something to do with education in the 
sense of developing alert, shrewd, and imag- 
inative qualities. It also involves much ex- 


TABLE 5


F(T)V: HISTORY OF INHIBITING, RESTRAINING ENVIRONMENT (vs. LAXNESS)


Loadings in Studies


First-Order Factor Average C5 C6 R1 R2 N1
UI 1[...] Inhibition +36 +32 +54 +45 +03 +47
UI 23 High Mobilization of Resources +18 +25 +19 +10
UI 31 Wary Realism +16 +0[...] (12) +52
TABLE 6

F(T)VI: NARCISTIC DEVELOPMENT (vs. ENVIRONMENTAL CONTACT AND INVESTMENT)


Loadings in Studies


First-Order Factor Average C5 C6 R1 R2 N1
UI 26 Narcistic Self-Will +33 +59 +31 +50 +25 00
UI 27 Apathy-Fatigue +30 +36 +23
UI 34 Autistic Nonconformity +51 +51


plicit self-awareness and “canniness.” Possi- 
bly, an upbringing in a critical, competitive, 
high-standard home atmosphere could gen- 
erate such a pattern. The above interpretation 
is, at least, not inconsistent with the fact that 
neurotics with marked sociopathic trends 
have significantly lower scores than normals 
on this factor (Cattell & Scheier, 1961). 

Our hypothesis is that F(T)V represents 
a dimension of personality resulting from an 
environment in which considerable inhibition 
prevails. The somewhat unexpected role of 
UI 23, here as in F(T)II, suggests that there 
may be some need to modify our hypotheses 
about this primary. However, it is reasonable 
to interpret UI 23 as unused reserves of en- 
ergy and it might then fit here as the cum- 
ulative result of inhibited, undischarged re- 
activity. 

F(T)VI is a rather narrow factor which nevertheless has a
consistent character of narcistic and autistic development. It
is in some sense a false ego development, a moving out of
contact with reality. This hypothesis is supported by its being
significantly high in neurotics with sociopathic trends, but
not in typical neurotics (Cattell & Scheier, 1961).


TABLE 7

F(T)VII: HIGH TENSION TO ACHIEVE, CONTROLLED DRIVE TENSION LEVEL
(vs. LOW TENSION TO ACHIEVE)

Loadings in Studies

First-Order Factor Average C5 C6 R1 R2 N1
UI 24 Free Anxiety +40 +38 +38 +52 +42 [...]
UI 18 Shrewdness +23 +14 +38 +34 +18 [...]
UI 30 High General Reactivity -19 -56 -12 00 [...]
UI 25 Imaginative Tension +18 +49 +14 (-07) +[...]
UI 19 Promethean Will +16 +13 +10 +07 +05 +[...]


In F(T)VII, we may have a form of tension wider than Free
Anxiety, UI 24, and therefore perhaps representing total drive
tension as it is oriented and controlled toward achievement. On
the other hand, an argument could be made from present
inclusions and exclusions that the central influence in this
factor is insecurities connected with the self-concept. We
shall keep interpretive possibilities open by a descriptive
title indicating anxiety and related drive tensions centered in
insecurity, and under control of an achievement goal.

Now that the patterns of second-order objective test factors
have been presented and tentatively interpreted, it is logical
to ask if very broad third-order factors can be found.
Accordingly, we computed the correlations among second-order T
factors, in the three most recent studies (R1, R2, N1) and
averaged these values in Table 8. These correlations among
second-orders are generally very low. Over half of the r's are
.10 or less; and only one, between F(T)III and F(T)VI, could
confidently be called statistically significant. It is possible
that a third-order [factor may] be found, including the
second-orders of Temperamental Ardor, F(T)III, and Narcistic
[...] of second-order results. However, it is possible to make
a general [...] that, for practical purposes, oblique
factorizations do not permit an indefinitely large series of
higher-and-higher order [...]. The correlations definitely
become [lower] as one moves to higher orders; correlations
among second-orders are generally lower than correlations among
first-orders, and correlations among first-orders generally
lower than correlations between tests. There is no logical
reason why the same basic data should not be describable by
using [...] categories, alternatively at lower or [...], with
the broader categories being fewer but missing more detail. On
the other hand, it would be somewhat confusing and [...] if
this process were to prove [...] and practically possible over,
say, six or even a greater number of distinct [...] generality
(orders). The data of [...] strongly suggest an upper limit of
three orders for objective test data, and, since there are only
four well-confirmed second-order [factors] in questionnaires
(Cattell, 1956; Cattell & Scheier, 1961), it is almost certain
that they too will yield no more than three orders.


TABLE 8

CORRELATIONS AMONG SECOND-ORDER OBJECTIVE-TEST [FACTORS]

          F(T)II  F(T)III  F(T)IV  F(T)V  F(T)VI  F(T)VII
F(T)I      -.10    -.06     -.17    -.19   -.18     .07
F(T)II             -.01     -.07    -.17    .07     .2[...]
F(T)III                     -.18    [...]   .38    -.14
F(T)IV                              -.08   -.06     .05
F(T)V                                      -.08    -.07
F(T)VI                                             -.04

Note. Average over R1, R2, and N1 studies.


1. [...] correlations among 20 first-order personality [...]
five separate studies employing [...]

2. An oblique simple structure factoring,
independently on each of the five correlation 
matrices, yields seven factors, the general 
form of which is confirmed by matching 
across the five studies. 

3. With relatively short supportive discus- 
sion, these factors have been indexed and 
named as follows: 


F(T)I, Tied Socialization 

F(T)II, Expansive Ego 

F(T)III, Temperamental Ardor

F(T)IV, Educated Self-Consciousness 

F(T)V, History of Inhibiting, Restraining 
Environment 

F(T)VI, Narcistic Development 

F(T)VII, Tension to Achieve 


4. Space dictates restricting this presenta-
tion to evidence of the patterns, their match-
ing, and a few criterion associations. Else- 
where, a continuation will be made of the 
fuller theoretical development required by 
this first demonstration of consistent person- 
ality structure at the more pervasive level of 
second-order objective test factors. 

5. The possibility of third-order factors is briefly discussed.
[...] study¹ is concerned with the [application] of the “level
of aspiration” technique to the study of persons with cardiac
disease. Since 1930, when Hoppe first described the level of
aspiration phenomenon, only six [...] studies cited in
Psychological Abstracts relate to physical or physiological
conditions. [...] between adrenal activity and level of
aspiration scores. Little and Cohen (1951) found that asthmatic
children showed significantly higher levels of aspiration than
nonasthmatic [...]. Hecht (1952) was able to differentiate
ulcer patients from colitis patients on level of aspiration
scores, the ulcer patients setting significantly higher “D
scores.” Sco[...] also found differences between [...] patients
and a neurotic (non[...]) group on a level of aspiration [...]
as well as on several other measures; the [...] group had lower
level of aspiration [...] than the neurotic group. Raifman
[...] that a group of ulcer patients [...] significantly higher
levels of aspiration than a [group] of normals and a group of
[neurotics]. The present study will attempt to [...] whether
persons with hypertensive heart disease [...] persons with
nonhypertensive arteriosclerotic heart disease on a level of
aspiration task.
Perhaps the best known effort to describe 
distinctive personality characteristics of per- 
sons with cardiac disease is that of Dunbar 
(1948) who described personality profiles for 
several types of patient. Alexander (1950) 
considers that probably the most valid of 
Dunbar’s profiles is that of the “coronary” ²
patient. Yet an objective study of 46 coronary 
patients performed more recently by Miles, 
Waldfogel, Barrabee, and Cobb (1954), using 
psychiatric interview, social history, and 
psychological tests, disagrees with or fails to 
confirm most of the specific characteristics at- 
tributed to coronary patients by Dunbar. The 
study did show, however, more strenuous work 
histories, with more physical and psychologi- 
cal stress and strain than the group of nor- 
mals; in this respect the study tends to con- 
firm that part of Dunbar’s profile which pic- 
tures the coronary patient as a consistently 
striving person who works harder and longer than the average
adult (Dunbar, 1948, p. 307). Alexander (1950, p. 72) agrees
with Dunbar on this characteristic.


2 It will be noted that the word “coronary” in the introductory
section has been used in quotation marks. This is because the
term needs clarification; “coronary” and “hypertensive” are not
mutually exclusive types of heart disease. Dunbar noted, for
example, that 27% of the patients with coronary occlusion in
her study also had hypertension (Dunbar, 1948, p. 251). Many
hypertensives may eventually have coronary occlusion or
coronary insufficiency. Miles, Waldfogel, Barrabee, and Cobb
(1954), in their study of coronaries, limited the study to
cases in which there had actually been a coronary attack
(occlusion of the coronary artery) with an absence of
hypertension.

The person with hypertensive cardiac dis- 
ease is characterized by Dunbar as having in 
common with coronary patients a constant 
striving to subdue or surpass competitors, but 
as being different from coronary patients in 
that hypertensives have a greater fear of 
criticism, a greater fear of responsibility, a 
greater fear of falling short. They are more 
likely to choose occupations below their 
ability and are usually less successful than 
coronaries (Dunbar, 1948, p. 264). 

If hypertension is preceded by a history of 
being unsuccessful, fearful of criticism, or of 
responsibility, and of falling short, as sug- 
gested by Dunbar, it may produce a different 
pattern of response on a level of aspiration 
test than would be produced by a more suc- 
cessful, less fearful history. There is some 
support for this hypothesis in the literature. 
Sears (1940), for example, has shown that 
chronically unsuccessful children, as judged 
by school achievement, show a different pat- 
tern of response from successful children on 
a level of aspiration test. The pattern was not 
necessarily one of lowered aspirations, how- 
ever, as might be suggested by Dunbar’s 
assertion that the hypertensive is more likely 
to choose occupations below his ability. In 
Sears’ study the unsuccessful group produced 
a bimodal distribution, showing either an un- 
realistically higher level of aspiration or an 
unnecessarily low level of aspiration. The 
successful individual typically sets his goal 
near but slightly above his last previous per- 
formance. Kurt Lewin (Lewin, Dembo, 
Festinger, & Sears, 1944) in his review of 
the literature and again in 1948 affirms these 
as characteristic reactions of the successful 
and the unsuccessful individual in level of 
aspiration situations. 

In a recent theoretical article concerned 
with level of aspiration and risk taking be- 
havior, Atkinson (1957) has drawn a distinc- 
tion between those responses, or persons, 
whose motivation is to achieve success, and 
those whose motivation is to avoid failure. 
Those who seek to achieve success typically
choose a task near their achievement level,
i.e., where the uncertainty of success or failure
is greatest. Those who seek to avoid failure
typically set their goals either very high or
very low. Thus in a level of aspiration study
we might be concerned with two kinds of
responses: the achievement oriented and the
failure oriented. These types of responses
seem comparable to Scodel’s “typical” and
“atypical” responses. Similarly, Scodel
(1953) divides the atypical responses into
atypical high and atypical low responses.

If Dunbar is correct in her assertions about
persons with coronary heart disease and those
with hypertensive heart disease, the two
groups should be similar when achievement
oriented behavior is considered. The responses
of the two groups in their failure oriented
behavior, however, should differ, according to
her view that the hypertensives are more
fearful of failure. She also states that the
direction of this failure oriented reaction is
toward choosing atypically low levels of
aspiration, in Atkinson and Scodel’s terms.

Although the writings of Dunbar have
given impetus to this general investigation,
the findings of Miles et al. (1954) have cast
considerable doubt about her specific hy-
potheses. The approach of the present in-
vestigation, however, is not to test Dunbar’s
specific hypotheses. Rather, we feel that the
status of the theoretical positions is such that
only an empirical approach is presently war-
ranted. Thus, the questions asked in the
present study are: do hypertensive heart
cases differ from nonhypertensive, arterio-
sclerotic heart cases in (a) the number of
achievement oriented aspiration responses,
and (b) the extent and direction of
failure oriented responses?


METHOD
Subjects 


All subjects were adult male veterans between the
ages of 32 and 65, hospitalized at Veterans Adminis-
tration Hospital, Wood, Wisconsin. Patients at this
hospital are from Wisconsin, Michigan, and Illinois;
the largest percentage of patients are from the Mil-
waukee metropolitan area.

All new admissions to the hospital were reviewed
and a list made of all those showing an admission
diagnosis of heart disease or suspected heart disease.
After the medical diagnostic procedures had been
completed, the cases were discussed with the ward
physician. Subjects with an established diagnosis of
“Hypertensive Heart Disease” were assigned to one
group; subjects with an established diagnosis of
“Arteriosclerotic Heart Disease” and an absence of
hypertension were assigned to the second group.
Subjects with other types of heart disease were elim-
inated from further consideration as not meeting the
criteria of the defined groups. Among those meeting
the [...] or categorical requirements no elim-
inations were made except subjects who were men-
tally or physically incapable of cooperating in the
study (for example, subjects who have had a cardio-
vascular accident resulting in hemiplegia, or aphasia,
or degeneration of mental functioning).

Of course, there is always the possibility that a
subject in the nonhypertensive arteriosclerotic group
has at some time in the past been hypertensive.
However, from the standpoint of research design,
such a misclassification would not invalidate a posi-
tive finding. Thus, if there is a psychological variable
associated with the presence of hypertension, and the
study indicates that the nonhypertensive and
the hypertensive group differ significantly
on this variable, the significance level would be a
[...] value; transfer of the misclassified case to
the group in which it belongs would
only increase the significance level.

Associated diseases found to be present coincident
with the primary diagnoses of hypertensive or arteri-
osclerotic heart disease were of interest, but were
not used in selecting the subjects because to do so
would change the basis of selection and disturb the
design of the study.
Task

Each subject was given [...] 15-second trials on the
Placing test of the Minnesota Rate of Manipulation
Test, and was asked to set a goal prior to each trial.
The test was presented to him as a test of his co-
ordination.

This task was selected after consideration of a
number of factors. The literature on the use of the
level of aspiration technique reveals that a wide
variety of tasks, both verbal and motor, have been
used. Tasks used, for example, include card sorting,
cancellation tests, addition, pegboards, bowling games,
and the Rotter Aspiration Board. A
decision was made in favor of the Minnesota Placing
test for the following principal reasons: (a) Hecht
(1952) found significant differences between peptic
ulcer and ulcerative colitis patients using the Purdue
Pegboard; the Minnesota Placing test is a somewhat
similar performance-type test of coordination but
requires [...] dexterity and less precise place-
ment. (b) Bowling games and other “game”-type
level of aspiration studies have been criticized
by [...], Handelsman, Stewart, and Super
[...], Stubbins (1950), and others as having too
[...] a “[...]” atmosphere and not being closely
related to life situations. This has led to the practice
in [...] level of aspiration studies of introducing
[...] or other extraneous motivation. If this criti-
cism has any validity, it might be that a game-type
task should be even less meaningful and challenging
to the older individuals with which the present study
is concerned. On the other hand, it was our feeling that the
Minnesota Placing test would prove to be a meaning-
ful task when presented as “a test of coordination”
to a hospitalized cardiac patient who has just come
through a cautiously graded progression in amount
of physical activity permitted and who is very much
aware of and concerned about his physical limitations.


Test Instructions 


Standardization of instructions was considered 
especially important in this study because it is 
known from previous studies of level of aspiration 
technique, such as those of Irwin (1942), of Walder 
(1951), and of Saji (1951) that the form of the 
stimulus question affects the results. Thus, “What 
score do you expect to get?” “What do you hope to 
get?” or “What will you try to get?” may bring 
different responses. Using Frank’s (1941) definition 
of level of aspiration as, “the level of future per- 
formance in a familiar situation which an individual 
explicitly undertakes to reach,” we have used as our 
stimulus question: “How many blocks will you try 
to get in on the next trial?” After the initial trial, 
the stimulus question was shortened to: “How many 
will you try for this time?” 


Age 

Age has not been shown to be a significant factor 
in level of aspiration studies; studies of this factor 
which have been done, such as those by Adams 
(1939), Walter (1948; Walter & Marzolf, 1951), and 
Reissman (1953), have tended to show that level of 
aspiration behavior is not significantly related to age. 
However, some of the studies did not examine age as 
a continuous variable but merely compared “old” 
groups with “young” groups; other studies which 
did examine age as a continuous variable have 
usually considered only a very limited range of ages. 
Because there has been insufficient research on age 
as a variable in level of aspiration behavior, and be- 
cause age is known medically to be related to arteri- 
osclerotic heart disease, it was our feeling that it 
should be controlled in the present study. The age 
of each subject was therefore recorded for compar-
ison (see Results section).


Stage of Recovery 
Although the effect of stage of recovery on level
of aspiration behavior is unknown and uninvesti-
gated, it seemed desirable to control this factor. The
experimental task was light sedentary activity in-
volving only 15 seconds of activity at a time and an
over-all time of only about 15 minutes; yet it re-
quired rapid movement of the arm, and might be
either a real or a perceived threat in the early stages
of recovery from cardiac disease. Thus no patients
were scheduled until they reached the “ambulatory”
stage of recovery, and this was approximately the
same for all subjects.


Education 
The average education for the Hypertensive group
was [...], with a standard deviation of 2.0, and a range
of 5-13 years of formal schooling completed. This
compares with an average of 8.8 and a standard
deviation of 2.6 for the Arteriosclerotic group, with
a range of 4-14 years completed.


TABLE 1

DIFFERENCE BETWEEN THE MEANS OF THE HYPERTENSIVE AND
ARTERIOSCLEROTIC GROUPS USING THE “A” SCORES

Group              N     M       SD      SE_D    t
Hypertensive       24    5.75    2.57
                                         0.72    1.83*
Arteriosclerotic   23    4.43    2.34

* p < .10.


Method of Scoring the Aspiration Responses 


Although many systems of scoring have been used 
in various studies in the literature, the most widely 
used is the D score or the average discrepancy be- 
tween the subject’s performances and his aspirations. 
The D score, however, fails to take account of what 
would appear to be an important psychological vari- 
able: the effect of success or failure upon the sub- 
ject’s immediately subsequent aspiration. Thus, an 
average D score of +1 for one person may represent 
a very consistent aspiration to do one better than 
the last performance no matter whether it was a 
success or failure experience. For another person a 
+1 score could represent an average of quite differ- 
ent reactions following success or failure. 

Guided by the Atkinson (1957) formulation that 
the achievement oriented individual sets his goal at 
the point of greatest uncertainty of success or failure, 
and assuming that this point is slightly above his 
last previous performance, we have set up the fol- 
lowing operational definitions for the present study: 
achievement oriented aspiration, one point above
the last previous performance; failure oriented high
aspiration, two or more points above the last previ-
ous performance; failure oriented low aspiration, a
goal at or below the last previous performance. The
performances of the subject are operationally classi-
fied as follows: Success, reaching or surpassing one’s
goal; Partial Failure, a performance one point be-
low the goal set; Failure, a performance two or
more points below the goal set.

From the responses, two scores are derived: (a) 
An achievement oriented score designated as the 
“A score”; this is the total number of achievement- 
oriented responses. (b) A failure oriented score des- 
ignated as the “F score”; this is the algebraic sum of 
the failure oriented responses weighted in the follow- 
ing manner: a high aspiration response following a 
Success was given a weight of 1; a high aspiration 
response following a Partial Failure was given a 
weight of 2; a high aspiration response following a 
Failure was given a weight of 3. Similarly, a low
aspiration following a Failure was assigned a weight
of -1; a low aspiration response following a Partial
Failure was assigned a weight of -2, and a low
aspiration response following a Success was assigned
a weight of -3.
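
The scoring rules above can be illustrated with a minimal sketch in Python. The function and variable names are hypothetical and are not part of the original procedure. One assumption is made that the text does not state: the goal set before the first trial is left unscored, since there is no previous performance against which to classify it.

```python
from typing import List, Tuple

# Weights for failure oriented responses, keyed by the outcome of the
# preceding trial, as listed in the scoring description.
HIGH_WEIGHTS = {"success": 1, "partial_failure": 2, "failure": 3}
LOW_WEIGHTS = {"failure": -1, "partial_failure": -2, "success": -3}


def trial_outcome(goal: int, performance: int) -> str:
    """Classify a performance relative to the goal set for that trial."""
    shortfall = goal - performance
    if shortfall <= 0:
        return "success"          # goal reached or surpassed
    if shortfall == 1:
        return "partial_failure"  # one point below the goal
    return "failure"              # two or more points below the goal


def score_subject(goals: List[int], performances: List[int]) -> Tuple[int, int]:
    """Return (A score, F score) for one subject.

    goals[i] is the goal stated before trial i; performances[i] is the
    number of blocks placed on trial i. The first goal is not scored
    (an assumption of this sketch).
    """
    if len(goals) != len(performances):
        raise ValueError("goals and performances must have equal length")
    a_score = 0
    f_score = 0
    for i in range(1, len(goals)):
        previous = trial_outcome(goals[i - 1], performances[i - 1])
        step = goals[i] - performances[i - 1]
        if step == 1:
            a_score += 1                      # achievement oriented
        elif step >= 2:
            f_score += HIGH_WEIGHTS[previous]  # failure oriented, high
        else:
            f_score += LOW_WEIGHTS[previous]   # failure oriented, low
    return a_score, f_score


if __name__ == "__main__":
    # Invented record of five trials, for illustration only.
    print(score_subject([10, 12, 14, 12, 12], [11, 12, 10, 12, 13]))
```

In this invented record the subject gives one achievement oriented response, two high responses (after a Success and after a Failure), and one low response after a Success, so the high and low weights largely cancel in the F score.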


RESULTS 


Table 1 presents results of a t test com-
paring the achievement oriented scores of the 
Hypertensive group with those of the non- 
hypertensive Arteriosclerotic group. The dif- 
ference between the means is not statistically 
significant, although the direction of the 
difference shows a higher mean for the Hyper- 
tensive group. 

Table 2 presents the results of a t test
comparing failure oriented scores of the 
Hypertensive group with those of the non- 
hypertensive Arteriosclerotic group. The dif- 
ference between the means was significant 
beyond the .05 level. A mean of 0.38 was 
found for the Hypertensive group, indicating 
that this group gave high aspiration responses 
about as often as low aspiration responses. 
A mean of —7.61 was found for the non- 
hypertensive Arteriosclerotic group, indicating 
that this group predominantly gave low 
aspiration responses. 
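
As a check on the arithmetic, the tabled t ratios can be approximately recovered from the reported means, standard deviations, and group sizes. The sketch below assumes that the standard error of the difference was computed from the unpooled group variances; the text does not say which formula was used, and the function name is illustrative only. The Table 1 entries are read here as 5.75, 2.57, 4.43, and 2.34.

```python
import math
from typing import Tuple


def t_from_summary(mean_1: float, sd_1: float, n_1: int,
                   mean_2: float, sd_2: float, n_2: int) -> Tuple[float, float]:
    """Return (standard error of the difference, t ratio).

    Uses the unpooled standard error sqrt(sd_1^2/n_1 + sd_2^2/n_2),
    which is an assumption of this sketch.
    """
    se_diff = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return se_diff, (mean_1 - mean_2) / se_diff


if __name__ == "__main__":
    # A scores (Table 1): Hypertensive versus Arteriosclerotic.
    se_a, t_a = t_from_summary(5.75, 2.57, 24, 4.43, 2.34, 23)
    # F scores (Table 2): Hypertensive versus Arteriosclerotic.
    se_f, t_f = t_from_summary(0.38, 12.45, 24, -7.61, 11.79, 23)
    print(f"A scores: SE = {se_a:.2f}, t = {t_a:.2f}")
    print(f"F scores: SE = {se_f:.2f}, t = {t_f:.2f}")
```

Under this assumption the computed values agree with the tabled ratios to within rounding (the A score ratio differs from the tabled 1.83 by about .01), which suggests that the reported summary statistics are internally consistent.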

Age has not generally been shown to be 
significantly related to level of aspiration. 
However, since it is known medically that it 
is related to arteriosclerosis, the question was 
raised as to the possibility that age had af- 
fected the results. However, analysis revealed
nearly identical means for the groups. The average
age of the Arteriosclerotic group was 54.17,
with a standard deviation of 8.43; the aver-
age age for the Hypertensive group was
54.13, with a standard deviation of 8.94. The
ages ranged from 32 to 63 in the Arterio-
sclerotic group and from 34 to 65 in the
Hypertensive group. The age factor, then, is
not significant in the present study.


TABLE 2

DIFFERENCE BETWEEN THE MEANS OF THE HYPERTENSIVE AND
ARTERIOSCLEROTIC GROUPS USING THE “F” SCORES

Group              N     M       SD      SE_D    t
Hypertensive       24    +0.38   12.45
                                         3.54    2.26**
Arteriosclerotic   23    -7.61   11.79

** p < .05.
Table 3 presents a comparison of the per-
formance scores of the two groups. The mean
of the Arteriosclerotic group was 13.91 with
a standard deviation of 2.61; the mean of
the Hypertensive group was 12.88 with a
standard deviation of 2.04. The difference of
1.03 in the means was not statistically
significant.

Table 4 presents a comparison of the de-
gree of improvement of the two groups as
measured by the difference between the
score on the first trial and the highest score
attained on any of the subsequent trials.
For the Arteriosclerotic group, the mean im-
provement was 3.48 with a standard devia-
tion of 1.24. The mean for the Hypertensive
group was 3.29 with a standard deviation of
1.28. The slight difference in the mean im-
provement (.19) was not statistically sig-
nificant.


TABLE 3

PERFORMANCE SCORES OF THE HYPERTENSIVE
AND ARTERIOSCLEROTIC GROUPS

Group              N     M       SD      SE_D    t
Hypertensive       24    12.88   2.04
                                         0.65    1.57
Arteriosclerotic   23    13.91   2.61


TABLE 4

DEGREE OF IMPROVEMENT OF THE HYPERTENSIVE
AND ARTERIOSCLEROTIC GROUPS

Group              N     M       SD      SE_D    t
Hypertensive       24    3.29    1.28
                                         0.38    0.50
Arteriosclerotic   23    3.48    1.24


DISCUSSION


The intent of this study was to determine
whether a group of hypertensive cardiac pa-
tients would differ from a group of non-
hypertensive arteriosclerotic cardiac patients
on a level of aspiration test. The results show
that when the level of aspiration is measured
in terms of failure oriented responses, the two
groups differ significantly, with the hyper-
tensives showing a higher level of aspiration
response. In Atkinson’s (1957) terms there
are two ways of responding to fear of failure:
choosing a goal in excess of reasonable ex-
pectation, or choosing a goal which can be
easily attained. Thus, to defend against anx-
iety, the failure oriented individual either
chooses a goal so high that failure to attain
the goal is not destructive, or chooses a goal
which is low enough so that he can be sure
of achieving it. In terms of fear of failure,
our data, therefore, suggest that in avoid-
ing failure, the arteriosclerotic is more likely
than the hypertensive to utilize the defense
of choosing an easily attainable goal.
Analysis of the frequency of the achieve-
ment oriented responses fails to show a sta-
tistically significant difference between the
groups. However, the significant finding is
that when the failure oriented responses of
both groups are analyzed, the hypertensive
was more likely than the arteriosclerotic to
choose a response which was so high as to
assure failure. The difference between the
arteriosclerotic and the hypertensive lies in
the results of their different approaches to
failure. The arteriosclerotic avoids the per-
sonal feeling of failure by arranging for
objective success, i.e., by setting his aspira-
tion low. The hypertensive, by contrast,
arranges for objective failure. Thus no mat-
ter what the reason, the hypertensive less
often reaches his goal; that is, he objectively
fails.

The results of this study lead us to specu- 
late about a possible relationship between 
hypertension and continual self-chosen fail- 
ure. Reiser, Brust, and Ferris (1951) have 
discussed the role of “life stress” in the de- 
velopment of hypertension, assuming that the 
patient’s reaction to the stress is elevated 
blood pressure. But just what, in the life 
stress of the hypertensive, would lead to the 
blood pressure elevation? Further, what in the 
life stress distinguishes the hypertensive from 
the arteriosclerotic, for the psychosomatic lit- 
erature (Miles et al., 1954) includes studies 
which find that all heart patients live under 
more psychological stress than normals. 

Our data would support the hypothesis 
that continued failure, resulting from too 
high goals, rather than a general life stress, 
can lead to hypertensive reactions. In at-
tempting to explain this relationship, we
hypothesize that continued failure is in some 
way associated with a physiological with- 
drawal of blood from the arteriolar bed. The 
increased blood pressure of the hypertensive 
follows from the peripheral resistance in the 
arteriolar bed. Thus the hypertensive reaction 
is not the direct result of general psycho- 
logical stress, but rather, is the direct result 
of the hypertensive avoidance reaction to the 
specific stress of continued failure. Support 
for the notion that the hypertensive reaction, 
i.e., constriction of peripheral vessels, is not 
due to general psychological stress itself 
comes from a well-controlled study by Baker 
and Taylor (1954) which found that skin 
temperature rises, i.e., arterioles dilate, under 
general psychological stress. 

The present results also force some modi- 
fication of Dunbar’s characterization of the 
hypertensive. Hypertensives may fear falling 
short, as she suggests, but they react to this 
fear, not by choosing goals too low, as she 
suggests, but by choosing goals so high that 
they cannot possibly be achieved. Thus they 
insure failure. 

Because this is the first study, to our
knowledge, to apply level of aspiration tech-
niques to the study of psychological variables
in heart disease, the results obtained suggest
further study of cardiac patients with this
technique.


SUMMARY 


Twenty-four hypertensive cardiac patients 
and 23 nonhypertensive arteriosclerotic car- 
diac patients hospitalized at a Veterans Ad- 
ministration Hospital were administered a 
level of aspiration task based on the Min- 
nesota Rate of Manipulation Test. Achieve- 
ment oriented scores and failure oriented 
scores were derived from the aspiration re- 
sponses. An achievement oriented response 
was operationally defined as an aspiration 
one point above the last previous perform- 
ance; a failure oriented response was defined 
as either “high” (two or more points above 
last previous performance), or “low” (at or 
below the last previous performance). No 
significant difference between the groups was 
found in the frequency of achievement 
oriented scores. However, when failure 


and Richard M. Lundy 


oriented responses were broken down into 
high and low response patterns, the hyper-
tensive group gave significantly (p < .05)
fewer low aspiration responses than the non-
hypertensive arteriosclerotic group. Level of
performance and degree of improvement were 
not significantly different for the two groups. 
The hypertensive group “arranged” for re- 
peated failure by consistently setting exces- 
sively high goals. It is hypothesized that the 
withdrawal of blood from the arteriolar bed, 
resulting in increased blood pressure, is an 
avoidance reaction to the repeated and con- 
tinual failure experiences which the failure 
oriented hypertensive arranges for himself in 
a neurotic fashion. 
OBJECTIVE ESTIMATES OF CLINICAL JUDGMENTS 1


ROLFE LAFORGE
University of Illinois


One of the more promising avenues towards 
a theory of personality develops from the dis- 
covery of the kind of structure most appro- 
priate for a model of clinical inference. The 
present paper reports an early and imperfect 
example of this approach. There are at least 
four major imperfections. 

First, while one would prefer to abstract 
representative tasks of clinical inference, in 
this study only one kind of information and 
only one clinical construct were available. 
MMPI profiles of scores on the 12 K-cor- 
rected clinical scales were given to four psy- 
chologists experienced in the use of the 
MMPI. Each psychologist rated the “degree 
to which repression was relied upon by the 
patient as a defense against anxiety.” The av- 
erage of these four ratings for any patient is 
the value of the “clinically inferred” variable 
(R). 2

Second, one would like to ask the psycholo- 
gists who make the clinical inferences to de- 
scribe also the cues, working hypotheses, and 
methods of verification which they use. Here, 
each rater was asked only to list as many 
“cues” of repression as could be found in 
MMPI profiles. The four lists were pooled, 
and duplications removed; 27 cues remained, 
with many dependencies among them. 

Third, one would like to examine the rela- 
tions between data and construct, and be- 
tween construct and prediction, using the most 
general statistical models available. In this 
study only two models were applied. One 
model employed a linear prediction of R from 
its multiple regression on Hy and Sc. The 
other method sought cues which could be
associated with different ordinal subregions
within the total range of R. That is, the vari-
able R was used to order the profiles within
a sample; then the Mann-Whitney and the
Wald-Wolfowitz (runs) tests were applied
separately to each of the suggested cues to
test for relationship to R. A 5% level of sig-
nificance was used in all tests. In the first
sample of 35 outpatients, eight cues were re-
lated to R by the Mann-Whitney. Four of
these same cues were related to R according
to the runs test, plus two additional cues
which had a nonmonotonic relationship with
R. (The four found by the Mann-Whitney
alone are presumably the result of its greater
power against monotonic alternatives to the
null hypothesis.)

1 This work was supported by United States Pub-
lic Health Service Postdoctoral Fellowship M3634
(1952).

2 For details, see: LaForge, Leary, Naboisek, Cof-
fey, & Freedman (1954, pp. 132, 135-136).

Fourth, and most serious of the imperfec-
tions, cross-validation of these results on the
second sample of 83 outpatients became im-
possible as a result of loss of data during a
6-year lapse. The investigator redefined from
memory 26 cues whose relation to R was then
tested in the second sample by the runs test
alone. Of 14 “pattern” cues referring to the
relative height of two or more scales, 11 were
found to relate to R; of 11 “absolute eleva-
tion” cues, 4 related to R. One “mixed” cue,
referring to both elevation and pattern, also
predicted R.

These 16 cues could be used as a checklist
to enable the MMPI novice or researcher to
make a quick estimate of R from an MMPI
profile. To simplify the estimation process, a
Guttman scale was constructed by eliminat-
ing cues approximately equal in relative fre-
quency to others. The remaining seven cues
scaled in the second sample with a reproduci-
bility above .90; removal of one cue raised
this to .96. A check on a third sample of 207
outpatients gave a reproducibility of .93. Be-
cause the cues are to some extent experimen-
tally dependent, this figure indicates only the
sufficiency of the estimate in use. The six-
item Guttman-type scale 3 correlated as well
with R (.799) as did the simple sum over all
predictive items (.788). In comparison, the
multiple correlation of Hy and Sc with R,
based on the same sample of 83 cases, was
.843. (The beta weights in this sample were,
for K-corrected T scores: Hy, .0718; Sc,
-.0460. Thus a quick estimate of R could
be obtained by subtracting Sc from 1½ times
Hy.)
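
The quick estimate just described can be sketched as follows. The two weights are those reported in the text; the function names and the example profiles are hypothetical. No intercept is reported, so the weighted sum below is only the part of the regression estimate that varies with Hy and Sc, and is useful for ordering profiles rather than for recovering R itself.

```python
HY_WEIGHT = 0.0718   # reported weight for Hy (K-corrected T score)
SC_WEIGHT = -0.0460  # reported weight for Sc (K-corrected T score)


def weighted_sum(hy_t: float, sc_t: float) -> float:
    """Weighted combination of Hy and Sc; the intercept is not reported."""
    return HY_WEIGHT * hy_t + SC_WEIGHT * sc_t


def quick_estimate(hy_t: float, sc_t: float) -> float:
    """Shortcut from the text: one and a half times Hy, minus Sc."""
    return 1.5 * hy_t - sc_t


if __name__ == "__main__":
    # Ratio of the two weights, which motivates the factor of 1.5.
    print(round(HY_WEIGHT / abs(SC_WEIGHT), 2))
    # Two invented profiles, given as (Hy, Sc) T scores.
    for hy_t, sc_t in [(70.0, 50.0), (60.0, 65.0)]:
        print(hy_t, sc_t, weighted_sum(hy_t, sc_t), quick_estimate(hy_t, sc_t))
```

The ratio of the two weights is about 1.56, so the shortcut rounds the Hy weight down slightly relative to Sc; for the two invented profiles both versions order the profiles the same way.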

On the basis of the present evidence, there
is little to choose between the two estimates.
Of course the lack of cross-validation makes
comparison particularly difficult. Considera-
tion of those cues which predicted R as
against those which did not significantly dis-
criminate suggests that the emphasis on con-
figuration or pattern in the teaching of clini-
cal inference may be justified. On the other
hand, the equal success of a linear combina-
tion of two scales points up the psycho-
metrician’s faith in the efficacy of negatively
correlated (r_HySc = -.379) predictors in an
apparently configural situation. And either ob-
jective measure showed correlations with R
approximately as high as the interrater cor-
relations.

3 The Guttman-scaled cues are L > 60, L > F, Hs +
Hy > D + Sc, Hy > Sc, K + 10 > F, and Hy + 10
> D, in order of increasing frequency of occurrence.
The most predictive single cue was Hs + Hy > D
+ Sc.
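
The six Guttman-scaled cues given in Footnote 3 lend themselves to a simple checklist, sketched here in Python. Several things are assumptions of this sketch rather than statements of the original: the third cue is read as Hs + Hy > D + Sc, the scale score is taken to be the number of cues present, and all names and profile values are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Profile = Dict[str, float]  # K-corrected T scores keyed by scale name

# Cues in order of increasing frequency of occurrence (rarest first).
CUES: List[Tuple[str, Callable[[Profile], bool]]] = [
    ("L > 60", lambda p: p["L"] > 60),
    ("L > F", lambda p: p["L"] > p["F"]),
    ("Hs + Hy > D + Sc", lambda p: p["Hs"] + p["Hy"] > p["D"] + p["Sc"]),
    ("Hy > Sc", lambda p: p["Hy"] > p["Sc"]),
    ("K + 10 > F", lambda p: p["K"] + 10 > p["F"]),
    ("Hy + 10 > D", lambda p: p["Hy"] + 10 > p["D"]),
]


def cue_pattern(profile: Profile) -> List[bool]:
    """Evaluate each cue, rarest first."""
    return [test(profile) for _, test in CUES]


def cue_count(profile: Profile) -> int:
    """Number of cues present (assumed here to be the scale score)."""
    return sum(cue_pattern(profile))


def is_perfect_scale_pattern(pattern: List[bool]) -> bool:
    """True when no rarer cue is present without every more frequent one."""
    return pattern == sorted(pattern)


if __name__ == "__main__":
    # Invented profile, for illustration only.
    example = {"L": 50, "F": 55, "K": 60, "Hs": 65, "D": 60, "Hy": 70, "Sc": 55}
    print(cue_count(example), is_perfect_scale_pattern(cue_pattern(example)))
```

A profile fits the scale perfectly when its absent cues are all rarer than its present ones; departures from that pattern are what lower the reproducibility coefficient.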
A NOTE ON “IMPULSE REPRESSION AND EMOTIONAL
ADJUSTMENT” 


H. J. EYSENCK 
University of London 


In a recent paper, Grater (1960) has tested 
the Freudian theory of impulse repression as 
a correlate of neuroticism, emerging with con- 
clusions which apparently contradict that 
theory; the more neurotic subjects were, if 
anything, less “repressive” than were the non- 
neurotic ones. This result would appear to be 
in line more with Mowrer’s (1953) view of 
neurosis, according to which 


the problem-solving activity which is usually referred
to clinically as self-protectiveness or defensiveness
. . . functions in the interest of the primary drives
or id, rather than, as Freud posited, in the services
of the socially derived forces of the superego
(p. 145).


I have discussed the point at issue between 
orthodox Freudian writers and Mowrer else- 
where (Eysenck, 1957, p. 82f.), and have 
suggested there that the distinction which 
must be made in order to accommodate the 
known facts is one between extraverted neu- 
rotic behavior patterns (hysteria, psycho- 
pathy, hypochondriasis, etc.) and introverted 
neurotic behavior patterns (anxiety, reactive 
depression, obsessional-compulsive, dysthymic 
reactions). This personality dimension of 
extraversion-introversion is conceived of as 
being orthogonal to neuroticism, and I have 
further suggested that “impulse repression” 
and socialization generally are in part caused 
by constitutional factors closely linked with 
introversion. Dysthymic neurotics, according 
to this view, are “oversocialized,” hysteric and 
psychopathic ones “undersocialized.” This 
theory has been discussed in some detail 
in relation to the experimental evidence 
(Eysenck, 1957, 1960a, 1960c) and it may 
be concluded that it serves to reconcile a large 
amount of factual material.

When we turn to Grater’s study we find
that he has defined neuroticism in terms of
three MMPI scales, two of which are measures
of extraverted neuroticism (Hy, Hs), while 
the most clear-cut introverted scale (Pt) was 
not used at all. According to the analysis 
given above, therefore, we would expect neu- 
rotics (as defined by Grater’s MMPI scores) 
to be extraverted and less given to impulse 
repression than nonneurotics. His results, as 
far as they go, bear out this prediction, al- 
though in only one or two instances do his 
scores reach statistical significance. 

The purpose of this note is not so much 
to reinterpret Grater’s data as to draw at- 
tention to the absolute necessity, in work of 
this kind, to take into account the two- 
dimensional nature of the test-space in 
which the experimenter is working (Eysenck, 
1960b). Much experimental work in this field 
is wasted because results are quite uninter- 
pretable, it being impossible from the data 
given to sort out the dimensions involved;
work with the Manifest Anxiety scale is a
good example of this, the resulting score 
having loadings both on neuroticism and on 
introversion (Bendig, 1960; Eysenck, 1957). 
Much of the theoretical disputation regarding 
the nature of neuroticism is sidetracked by 
emphasizing either the extraverted or the 
introverted side (Miller & Dollard, 1950; 
Mowrer, 1953). The evidence for at least two
factors in this field is now practically con- 
clusive (Eysenck, 1960b) and it would seem 
desirable to recognize this fact in the 
design and interpretation of psychological 
experiments. 
SEX DIFFERENCES IN MENTAL HEALTH ANALYSIS
SCORES OF ELEMENTARY PUPILS 


JOSEPH C. BLEDSOE 


University of Georgia 


The manner in which the individual per- 
ceives himself has been regarded as one indi- 
cator of the degree of mental health he dem- 
onstrates or possesses. Among the currently 
available instruments for measuring self-per- 
ception of mental health status is the Mental 
Health Analysis prepared by the California 
Test Bureau. The test manual makes no men- 
tion of sex differences in norms for the Men- 
tal Health Analysis. Previous studies (Ausubel, 
Balthazar, Rosenthal, Blackman, Schpoont, & 
Welkowitz, 1955; Davidson, Sarason, Light- 
hall, Waite, & Sarnoff, 1958; Sarason, David- 
son, Lighthall, & Waite, 1958) have indicated
that girls may perceive themselves as
significantly more accepted and intrinsically 
valued than boys. The present study provides 
more evidence of the possibility of sex differ- 
ences in mental health status, as reflected by 
responses to the Mental Health Analysis.

The subjects were 96 girls and 101 boys 
enrolled in the fourth through the seventh 
grades in an elementary school of a south- 
eastern city of approximately 35,000 popula- 
tion. The Elementary Form of the Mental 
Health Analysis was administered as a part 
of a 3-year in-service education program for 
teachers designed to promote better under- 
standing of mental health principles and prac- 
tices. 

The Mental Health Analysis consists of 200
items to be answered by circling Yes or No. 
Five sorts of personality “liabilities” and five 
sorts of “assets” are investigated. Liabilities 
include behavioral immaturity, emotional in- 
stability, feelings of inadequacy, physical de- 
fects, and nervous manifestations. The assets 
include close personal relationships, inter- 
personal skills, social participation, satisfying 
work and recreation, and adequate outlook
and goals. The reliability of the Elementary
scale is reported as: assets, .90; liabilities, 
.89; total, .90. Reliability coefficients for com- 
ponent scores of the Elementary scale vary 
from .80 for “outlook and goals” to .85 for 
“physical defects.” Content validity of the 
instrument is based upon the adequacy of 
item selection, the meaningfulness of the 
analysis of the mental health categories, and 
the cleverness in disguise of items. Studies of 
concurrent validity are reported in the test 
manual (California Test Bureau, 1959). 
Means and standard deviations for the sev- 
eral components, categories, and total scores 
on the Mental Health Analysis for boys and
girls were computed.1 Differences in means
were then tested for significance by the t test
of significance. The .05 level of confidence
was accepted as the criterion for rejection of
the null hypotheses involved. Twelve of the
13 differences favored the girls, and 7 of these
differences met the .05 level of significance
criterion. Significant differences were found in
the total score, the total assets, the total liabilities,
and in close personal relationships,
outlook and goals (assets subscales), and behavioral
immaturity and feelings of inadequacy
(liabilities subscales). A nonsignificant
difference favoring the boys was found in the
physical defects subscale of the liabilities
category. Thus within the limitations of this
study, it appears that elementary school girls
tend to rate themselves significantly higher on


1 A summary table of these statistics has been deposited
with the American Documentation Institute.
Order Document No. 6762 from ADI Auxiliary Publications
Project, Photoduplication Service, Library
of Congress; Washington 25, D. C., remitting in advance
$1.25 for microfilm or $1.25 for photocopies.
Make checks payable to: Chief, Photoduplication
Service, Library of Congress.
the Mental Health Analysis than do elementary
age boys. Implications for differential
norms may be suggested. More important
may be implications for teacher understanding
of and curriculum adjustment to the needs
of boys.

A COMPARISON OF ACCEPTORS AND RESISTORS OF DRUG
TREATMENT AS AN ADJUNCT TO PSYCHOTHERAPY 1


ALLEN RASKIN 


Veterans Administration, Washington, D. C. 


In a recent adjunct chemotherapy study with 
psychiatric outpatients, two of the major causes 
of patient attrition were refusal to take the study 
medication and excessive deviation from pre- 
scribed dosage levels. Can these patients be 
identified early in treatment and why are they 
reluctant to take the assigned medication? This 
study was undertaken as a preliminary effort to 
answer these questions by identifying variables 
which differentiate patients who remained in psy- 
chotherapy but resisted taking drugs from pa- 
tients who accepted drug treatment as an adjunct 
to psychotherapy. 

Data for the present study were collected in 22 
Veterans Administration Mental Hygiene Clinics 
as part of a larger chemotherapy study. There 
were four psychotherapy-plus-drug groups and 
one psychotherapy-only group in the larger study. 
The four study drugs were chlorpromazine, 
meprobamate, phenobarbital, and placebo. The 
study was conducted double-blind. Patients were 
told that the drugs had helped a lot of people 
with similar troubles. Study patients were all 
males, under age 50, who were acceptable for in- 
dividual psychotherapy. There were 142 patients 
(the Acceptors) who remained in psychotherapy 
at least 8 weeks and took their medication as 
prescribed. An additional 37 patients (the Re- 
sistors) also remained in psychotherapy for at 
least 8 weeks but either refused to take the assigned
medication or took significantly less than
the amount prescribed. As Acceptors and Resistors
did not differ significantly by treatment
group, all Acceptors were pooled into one group
and all Resistors into another.

1 An extended report of this study may be obtained
without charge from Allen Raskin (Neuropsychiatric
Research Laboratory, Veterans Benefits
Office; Munitions Building; Washington 25, D. C.)
or for a fee from the American Documentation Institute.
Order Document No. 6647 from ADI Auxiliary
Publications Project, Photoduplication Service,
Library of Congress; Washington 25, D. C., remitting
in advance $1.25 for microfilm or $1.25 for
photocopies. Make checks payable to: Chief, Photoduplication
Service, Library of Congress.

On the basis of data obtained from the pa- 
tients and from their therapists, the Acceptors 
and Resistors were compared on 40 personality, 
socioeconomic, and attitudinal variables. Resistors
did not report a greater number of adverse side
effects at the end of the first week on medication.
Eleven significant differences were found between
these two groups as compared to two expected by
chance for the number of tests made. Compared
to the Acceptors, the Resistors were better educated,
had a greater knowledge of psychiatry,
less favorable attitudes toward physicians, rated
themselves more hostile on an adjective check
list, and admitted to greater direct [illegible]
hostility. At the end of the initial session,
therapists rated Resistors as expressing a more
negative attitude toward taking the study
drugs, less likely to have their psychotherapy
facilitated by the addition of a drug, more
inwardly hostile, reporting more [illegible]
overt aggression, and less likable than Acceptors.
After 8 weeks of treatment, therapists rated
the Resistors as more resistive to psychotherapy.
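
The chance expectation mentioned in this paragraph is simple arithmetic and can be sketched as follows. This is only an illustration: the .05 significance level is an assumption inferred from the "two expected by chance" figure for 40 tests, the function names are hypothetical, and the binomial tail treats the 40 tests as independent, which the report does not claim.

```python
from math import comb


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Number of 'significant' results expected by chance alone."""
    return n_tests * alpha


def prob_at_least(k: int, n_tests: int, alpha: float) -> float:
    """Binomial probability of k or more chance hits among n_tests
    independent tests, each with false-positive rate alpha."""
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


if __name__ == "__main__":
    # 40 variables compared, assumed alpha of .05
    print(expected_false_positives(40, 0.05))
    # Chance probability of 11 or more hits under the independence assumption
    print(prob_at_least(11, 40, 0.05))
```

Under these assumptions the tail probability is very small, which is the informal point of contrasting eleven observed differences with two expected.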

These findings indicate that by the end of the
initial psychotherapy session, therapists should be able
to identify many Resistors by their negative attitude
toward taking the assigned medication. The
Resistors’ reluctance to take the study drug
[remainder of sentence illegible]. The
drugs apparently provided a convenient [illegible]
point for the hostile and aggressive [illegible]
noted by the Resistors’ therapists and admitted
by the patients themselves.


(Received September 23, 1960) 

AN INTERPRETATION OF m 1


JOHN M. REISMAN 
Rochester Child Guidance Clinic, New York 


Piotrowski (1957) hypothesized that the types 
of movement in m responses are always different 
from those of M and FM. His hypothesis was 
based on his interpretations of m as tendencies 
which are never acted out while tendencies indicated
by M and FM might be manifested in
overt behavior. These “tendencies” were said to
be primarily indicated by three types of action:
flexor, extensor, and blocked movements. Respectively,
they were characterized by giving in to
gravity, overcoming gravity, and tension; they
were interpreted as tendencies toward compliance,
self-assertion, and indecisiveness.

Using a sample of disturbed children whose 
mean age was 10, Reisman (1960) found results 
directly opposite to those predicted by the hypothesis.
This study repeated the test of the hypothesis
with adolescents to determine if, with in-
creasing age, there is a tendency for a person’s m 
to differ in type from his M and FM. 

Only responses that were identical with or
highly similar to Piotrowski’s examples of extensor,
flexor, and blocked movements were considered.
With this restriction, from a pool of 80
Rorschachs obtained from adolescents referred to
a child guidance clinic, only 22 were found which
contained at least 1 m and 1 M. In this sample,
there were 18 boys and 4 girls. Ages ranged from
[illegible]; IQ scores ranged from 80-120 (X = 102).
None of the subjects was considered psychotic.
All but one of the subjects, who was referred for
underachieving in school, had been referred for
acts of delinquency.

1 An extended report of this study may be obtained
without charge from John Reisman (Rochester Child
Guidance Clinic; 31 Gibbs Street; Rochester 4, New
York) or for a fee from the American Documentation
Institute. Order Document No. 6648 from
ADI Auxiliary Publications Project, Photoduplication
Service, Library of Congress; Washington 25, D. C.,
remitting in advance $1.25 for microfilm or $1.25
for photocopies. Make checks payable to: Chief,
Photoduplication Service, Library of Congress.

In a test for reliability, there was 90% and
[illegible] agreement between the experimenter and
two judges in categorizing 96 responses as to type.
Disagreements did not appreciably affect the re- 
sults. Movement responses were extracted from 
their records, coded, and written verbatim in a 
random order. Three months elapsed between the 
compilation and the categorizing of responses. 

Subjects produced a scorable total of 29 ex- 
tensor, 19 flexor, and 2 blocked M responses; 30 
extensor, 6 flexor, and 0 blocked FM responses; 
and 25 extensor, 2 flexor, and 1 blocked m re- 
sponses. These results were similar to those of 
the previous study and to findings with a sample 
of “normal” adolescents. Of the 22 records, 18 
had m and M or FM responses of the same type; 
14 records had m and M responses of the same 
type. These results were, once again, in direct 
opposition to Piotrowski’s hypothesis. Further- 
more, if it had been predicted that an adolescent’s 
M or FM is of the same type as his m, this hypothesis
would have been supported (χ² = 7.68;
p < .01).

Piotrowski has stated that tendencies indicated 
by m are never acted out. In only one case, that 
of a boy whose m was extensor and who was re- 
ferred for underachieving, was this expectation 
supported by the record of overt behavior. It 
would seem reasonable to conclude that, depend- 
ing upon the adolescent’s controls, the tendencies 
indicated by m may or may not be acted out. 
The results further suggest that m be interpreted 
as an awareness of impulses or feelings which the 
individual experiences difficulty in controlling and 
expressing. Piotrowski’s types of movement seem 
to offer assistance in understanding what kinds of 
impulses or feelings cause the person concern. 


REFERENCES 


Piotrowski, Z. N. Perceptanalysis. New York: Macmillan, 1957.
Reisman, J. M. Types of movement in children’s
Rorschachs. J. proj. Tech., 1960, 24, 46-48.


(Received October 7, 1960) 




Journal of Consulting Psychology
1961, Vol. 25, No. 4, 368


THE RELATION OF THE FAMOUS SAYINGS TEST TO 
SELF- AND IDEAL-SELF-ADJUSTMENT* 


BERNARD I. MURSTEIN 


Interfaith Counseling Center, Portland, Oregon 


Bass (1958, p. 479) cites Thurstone as stating 
“the best form of projective test is one which is 
quite unstructured for the subject but fairly well 
structured for the examiner.” Using Thurstone’s 
words as a model, Bass attempted to construct 
a new projective technique in which administra- 
tion and scoring are completely objective. The 
subject is presented with a booklet of 100 prov- 
erbs which he answers by checking the Yes, ?, 
or No box for each proverb. From his answers 
four slightly correlated scales have been derived. 
These are Social Acquiescence, Fear of Failure, 
Conventional Mores, and Hostility. Social Acqui- 
escence has received wide publicity because of 
the role it is reputed to play in most personality 
questionnaires. Little, however, has been done to 
study the meaning of Social Acquiescence apart 
from its role as an interferer in the study of 
other variables. Accordingly, to study the per- 
sonality correlates of the Famous Sayings Test 
(FST), scores on this test were compared to a 
personality questionnaire adaptation of the 
Butler-Haigh SIO Q sort instrument. 

The hypothesis for this study was that there 
is no significant correlation between the scales 
of the FST and the following scales of the SIO 
Q sort: (a) Self-Adjustment, (b) Ideal-Self- 
Adjustment, and (c) Self—Ideal-Self-Adjustment 
Discrepancy. 

Forty-five subjects (24M, 21F) from the intro- 
ductory psychology class at the University of 
Portland were told that the experimenter wished 
to investigate the merits of a new type of ques- 
tionnaire about how people describe themselves as 


1 An extended report of this study may be obtained 
without charge from Bernard Murstein (Interfaith 
Counseling Center; 729 Southwest Alder; Portland, 
Oregon) or for a fee from the American Documenta- 
tion Institute. Order Document No. 6759 from ADI 
Auxiliary Publications Project, Photoduplication 
Service, Library of Congress; Washington 25, D. C., 
remitting in advance $1.25 for microfilm or $1.25 
for photocopies. Make checks payable to: Chief, 
Photoduplication Service, Library of Congress. 
well as to construct a test of attitudes towards
famous sayings.

Of 15 correlations bearing on the hypothesis,
only the correlation between Social Acquiescence
and Self—Ideal-Self-Adjustment Discrepancy
proved to be significant at the .01 level (r = .45).
The results lend support to the interpretation of
Social Acquiescence as more than an interior
set. Social Acquiescence may be a superficial
phenotype reflecting considerable underlying anxiety
which manifests itself by an inordinate wish
to win acceptance by others through excessive
conformity.

As to the other scales (Conventional Mores,
Hostility, and Fear of Failure), none showed any
relation to Self- or Ideal-Self-Adjustment. Considering
earlier failures to find meaningful correlates
for these scales, it must be considered
doubtful that they will eventually find a niche for
themselves in trait measurement. The reason for
this pessimism is that, in addition to rather [illegible]
reliabilities, these scales make the further, probably
unwarranted, assumption that [illegible]
to a hostile proverb indicates that one is hostile.
It must be remembered that proverbs are
necessarily simple, bland statements that [illegible]
consensual validation about their general [illegible];
it is entirely possible that [several words illegible]
of the proverbs may be perceived [illegible]
by the majority of persons [illegible] unless
scaled values can be established as to the normative
agreement on the verification of a proverb.
To a lesser degree the same criticism is applicable
to Social Acquiescence. The fact that it comprises
56 items compared to 30 for each of the others,
and is not limited to a specific content as is each
of the others, probably accounted for its [illegible]
validity.


THE INTELLIGENCE OF BOYS WITH MUSCULAR DYSTROPHY


DON KEITH WORDEN
Western Reserve University


Since pseudohypertrophic muscular dystrophy
(PMD) was first described there has been controversy
concerning the intelligence of the children
who have it. Several studies done recently
claim that these children do not suffer from
mental deterioration, and that they are of average
intelligence. Academic retardation has been mentioned
as a frequent problem in some of these
studies.

This study was designed to clarify some of
these points in a more rigorous manner than has
been done to date. A group of boys with PMD
was given intelligence and achievement tests.

Several further questions were raised in this
investigation: (a) Would the measured depression
in intelligence be partly a function of the child’s
socioeconomic background? (b) Does any child
with a chronic disease show depression in intelligence?
(c) Would measured depression in intelligence
be due partly to the factor of the child’s
having a physically handicapping (as distinct
from merely chronic) illness? Appropriate control
groups were selected to investigate these questions
for: (a) siblings of the PMD group, (b)
diabetics, and (c) amyotonia congenita.

Form M of the 1937 Stanford-Binet scale was
used as the measure of intelligence. The measures
of academic achievement were the Gilmore Oral
Reading Test and the Metropolitan Achievement
Test sections in Arithmetic Fundamentals. The educational
quotient (EA/CA × 100) and the accomplishment
quotient (EA/MA × 100) were figured.
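
The two quotient formulas above, together with the standard ratio IQ (MA/CA × 100), can be sketched in a few lines. This is a minimal illustration, not the author's procedure: the function names and the example ages (in months) are hypothetical and are not data from the study.

```python
def intelligence_quotient(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: MA / CA x 100."""
    return 100.0 * mental_age / chronological_age


def educational_quotient(educational_age: float, chronological_age: float) -> float:
    """EQ: EA / CA x 100 (achievement relative to calendar age)."""
    return 100.0 * educational_age / chronological_age


def accomplishment_quotient(educational_age: float, mental_age: float) -> float:
    """AQ: EA / MA x 100 (achievement relative to mental age)."""
    return 100.0 * educational_age / mental_age


if __name__ == "__main__":
    # Hypothetical child: CA = 120 months, MA = 100 months, EA = 96 months
    ca, ma, ea = 120.0, 100.0, 96.0
    print(intelligence_quotient(ma, ca))
    print(educational_quotient(ea, ca))
    print(accomplishment_quotient(ea, ma))
```

Because AQ divides by mental age rather than chronological age, a child whose IQ and EQ are both depressed by a similar amount can still have an AQ near 100, which is the pattern the summary of this report describes.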

The muscular dystrophy group consisted of 38 
school age boys. It was possible to test 27 of 
these children’s siblings on the Binet. 

The amyotonia congenita group consisted of 16 
children. The diabetic and the diabetic-sibling 
group consisted of 36 children each. 

The PMD group had a mean IQ of 83. The 
range was from 46 to 134, but skewed radically 
to the left. 

Twenty-five of these 38 boys were given the 
achievement test in arithmetic. The mean educa- 
tional quotient (EQ) of this group was 84. The 
accomplishment quotient (AQ) was 96. The read- 
ing test was administered to 24 of these 38 boys. 
The mean educational quotient for reading was 
87 and the mean accomplishment quotient was 
101. The IQ did not differ significantly from the 
EQ in either reading or arithmetic. These three 
measures, however, differed significantly from the 
AQ in reading and the AQ in arithmetic. The two 
AQs did not differ significantly from one another. 

The siblings of the PMD group had a mean 
IQ of 110. The children with diabetes mellitus 
had a mean IQ of 107. The diabetic siblings had 
a mean IQ of 109. The patients with amyotonia 
congenita had a mean IQ of 118. The analysis 
indicated that the PMD group differed signifi- 
cantly from all control groups. 

In summary, boys with pseudohypertrophic 
muscular dystrophy functioned on an intelligence 
test significantly below average and below several 
control groups. Their IQs and EQs are both below 
average and do not differ from one another. How- 
ever their AQs indicate that they are doing all one 
could expect for children of their mental ages. 

Thus, within the confines of this study, the 
intellectual deficit seems specifically associated 
with the factor of having PMD. 

SYMBOLIC INTERPRETATION OF RORSCHACH CONTENT 1


JOSEPH F. RYCHLAK
St. Louis University

AND

DONALD E. GUINOUARD
Montana State College


This research continues the examination of 12
Rorschach contents studied earlier (Rychlak,
1959): Bat, Bear, Boots, Clouds, Fire, Fur, Hair,
Island, Mask, Mountains, Rocks, and Smoke. It
was decided that a culling of Rorschach protocols
for these contents, and then a comparison made
along personality dimensions for subjects perceiving
certain of these contents might throw
preliminary light on the little studied area of
content symbolism.

Subjects were 80 girls and 86 boys in the 11-14
year age range. Testing was done in groups, with
the Harrower (1959) Group Rorschach slides and
the High School Personality Questionnaire
(Cattell, Beloff, & Coan, 1958) used as inkblot
and personality measures, respectively. All testing
was completed within one week. Scoring percentage
of agreement between judges for identification
of the contents was 95; subjects’ one week
test-retest reliabilities, based on binomial expansion,
reached the .01 level for half of the
contents.

Frequently reported contents for all subjects
(N = 166) included Bat, Bear, and Fire;
whereas Fur, Hair, and Island were infrequently
noted. Girls reported Clouds, Hair, and Smoke
content significantly more often than boys
(p < .05 or less). Boys were more likely than
girls to report Mask content (p < .01).

The (chi square) .01 level personality findings
for the total sample were: Subjects who see
Boots tend to be talkative and cheerful. Children
who report Bat content are less tense and excitable
than their peers. The reporting of Fur is
related to a form of emotional instability, suggesting
overactivity and frustration. The .05 level
findings for total sample were: Fire content is
reported more frequently by subjects who are
tense and excitable. Children who report Rocks
content tend to be dominant, independent, and
outgoing. Island content is suggestive of a tough,
self-sufficient personality.

Considered individually by sex, the following
.05 level findings were noted: Boys who see
Clouds in the inkblots are phlegmatic, deliberate,
and self-effacing; girls who see Clouds are tough,
realistic, and group-conforming. Girls reporting
Hair are self-concerned and individualistic. The
contents Bear, Smoke, Mask, and Mountains did
not discriminate between subjects’ personalities.

To propose to study potential symbols nomothetically
implies the existence of a [illegible]
“universal” symbolism, something repugnant to
the common sense of many clinicians. Yet in
broad terms, such a symbolism may be possible
through the intervention of cultural [illegible].
Reference to a collective unconscious by some
theorists would be interpreted from this viewpoint
to mean that there are variables [illegible]
culture which, though learned by all or
many, are unverbalized. They are the [illegible]
side of cultural manifestation. The clinician,
through studies of this sort and the [illegible]
of the proper set as a result of his study, may
identify these variables in his client’s projective
responses, dreams, etc. In this connection, interesting
parallels were noted between present
findings and the forced associations of the earlier
study (Rychlak, 1959).

1 An extended report of this study may be obtained
without charge from [name and address illegible]
or for a fee from the American Documentation Institute.
Order Document No. 6761 from
ADI Auxiliary Publications Project, Photoduplication
Service, Library of Congress; Washington 25,
D. C., remitting in advance $1.75 for microfilm or
[price illegible] for photocopies. Make checks payable
to: Chief, Photoduplication Service, Library of Congress.

CLIENT DEPENDENCY AND THERAPIST EXPECTANCY
AS RELATIONSHIP MAINTAINING VARIABLES
IN PSYCHOTHERAPY


KENNETH HELLER
University of North Dakota

AND

ARNOLD P. GOLDSTEIN
University of Pittsburgh School of Medicine


The relationship and interactional aspects
of psychotherapy, while long considered important,
have received ever-increasing theoretical
and experimental emphasis in recent
years. Bordin (1959), for example, has stated:


The key to the influence of psychotherapy on the
patient is in his relationship with the therapist.
Wherever psychotherapy is accepted as a significant
enterprise, this statement is so widely subscribed
to as to become trite. Virtually all efforts to theorize
about psychotherapy are intended to describe and
explain what attributes of the interactions between
the therapist and the patient will account for whatever
behavior change results (p. 235).


Previous investigations of the psychotherapeutic
relationship have frequently been attempts
to identify and interrelate client,
therapist, and/or transactional variables
which theoretically appear to be important
dimensions of the therapist-client interaction.
The present study, continuing in this direction,
has focused upon client-therapist attraction
as its major concern. This potentially
significant dimension of the therapeutic relationship
has been defined by Libo (1957), in
a general sense, as “the resultant of all forces
acting on the patient to maintain his relationship
with the therapist.” To implement this
definition, he developed the Picture Impressions
Test, a projective technique designed to
call forth client verbalizations concerning
feelings toward therapists and the therapy
process. With client-therapist attraction [illegible]
in this manner, Libo (1957) demonstrated
a significant relation between the
magnitude of client attraction toward the
therapist and certain clearly observable and
[illegible] relevant, overt, client behaviors.

There is evidence to suggest that the specific
nature of these relationship maintaining
forces acting upon the client may include such
participant characteristics as client depend- 
ency and the therapist’s expectations regard- 
ing a favorable therapeutic outcome. In a 
discussion of dependency in psychotherapy, 
Dollard and Miller (1950) note that therapy 
is often facilitated by initial client depend- 
ency. According to their formulation, the 
client brings to the therapeutic situation a 
desire to please the therapist, this desire being 
considered one of the main forces helping the 
client overcome the initial anxieties associated 
with therapy. As therapy progresses the client 
is expected to grow in independence, since he 
need no longer rely on pleasing the therapist 
as his only motivation for continuing in 
therapy. 

The viewpoint that dependent clients be- 
come more independent after the successful 
completion of psychotherapy is also examined 
by another line of investigation. Studying the 
present-self and ideal-self descriptions of psy- 
chiatric patients, Fordyce (1953) found that 
those patients who described themselves as 
dependent stated that they would ideally like 
to see themselves as being more independent. 
Since Rogers and Dymond (1954) and others 
report that successful psychotherapy produces 
an increased congruence between present-self 
and ideal-self descriptions, it seems reasonable 
to expect that, after a course of successful 
psychotherapy, clients with pretherapy de- 
pendent self-descriptions should see them- 
selves as growing in independence. 

The therapist’s expectation of patient im- 
provement, a second potentially important 
relationship maintaining variable, has been 
demonstrated by Goldstein (1960b) to affect 
significantly the amount of improvement the 
patient reports as having taken place and 
also the duration of psychotherapy. Kelley 
(1949), Rosenthal (1959), and Uhlenhuth,
Canter, Neustadt, and Payson (1959) have
also demonstrated the potency of participant 
expectancies in two-person interactions. 


Hypotheses 


1. Client pretherapy attraction to the psy- 
chotherapist varies: (a) positively with client 
pretherapy dependency, and (b) positively 
with client over-therapy movement toward 
independence. 

2. Client pretherapy attraction to the psy- 
chotherapist varies positively with the latter’s 
expectation of client improvement. 


METHOD 


Two treatment conditions were utilized in the 
present investigation: therapy (experimental group), 
and no-therapy (control group). Thirty clients and 
10 therapists participated. Most of the clients were 
undergraduates in attendance at the Pennsylvania 
State University who had sought psychotherapy 
at the University Psychological Clinic. Clients were 
randomly assigned to the two treatment conditions 
and the 15 clients in the experimental group were 
then randomly assigned to therapists. Each therapist 
met with his client(s) two times per week for indi- 
vidual, 50-minute sessions. The 15 control clients 
were placed on a waiting list and did not participate
in formal psychotherapy during the 15 session duration
of the investigation.


The 10 therapists [...] completed their [...] psychotherapy,
including [...] therapists had [...] that their approach would vary
according to the client and the situation. The

therapists were employed by either the Psychological 
Clinic or the Division of Counseling at Pennsylvania 
State University and usually saw clients as part of 
their regular clinical case load. 


Measurement of Variables


Client-therapist attraction. As suggested above,
client-therapist attraction was operationally defined
as the client’s score on the Picture Impressions Test.
This projective technique consists of four cards
depicting therapy-like situations to which the client
is requested to respond in a manner analogous to
TAT administration. Content analysis scoring (Libo,
1956) (e.g., Locomotion, Barriers to Locomotion,
Satisfaction, etc.) was carried out independently by
the authors with complete agreement occurring on
83% of the client stories. For each client, an attraction
score was then determined by summing his
scores for each of his four stories.

Dependent behavior. Dependent behavior was
conceived of as the extent to which an individual 
prefers to have others prevent his frustration or 
punishment and provide need satisfaction (Fitz- 
gerald, 1958). In order to narrow the definition of 
dependency even further, it was decided to concen- 
trate on two aspects of dependent behavior described 
by Murray (1943) as Succorance and Deference. 
Measurement of dependent behavior occurred at two 
levels: 

Self-descriptive dependency. To measure the extent 
to which clients attributed dependent behavior to 
themselves, the Succorance, Deference, and Auton- 
omy scales of the Edwards Personal Preference 
Schedule (EPPS) (Edwards, 1954) were adminis- 
tered. Following the suggestion of several researchers 
(Bernardin & Jessor, 1957; Gisvold, 1958; Zucker- 
man & Grosz, 1958) a total self-descriptive depend- 
ency score was computed by summing the scores 
from the Succorance and Deference scales and sub- 
tracting from the sum the score from the Autonomy 
scale. 
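
The composite described above (Succorance plus Deference, minus Autonomy) can be written out as a short sketch. The function name and the example raw scores are hypothetical illustrations, not values from the study or from the EPPS manual.

```python
def self_descriptive_dependency(succorance: int, deference: int, autonomy: int) -> int:
    """Total self-descriptive dependency score:
    sum of the Succorance and Deference scale scores,
    minus the Autonomy scale score."""
    return succorance + deference - autonomy


if __name__ == "__main__":
    # Hypothetical EPPS raw scores for one client
    print(self_descriptive_dependency(succorance=14, deference=12, autonomy=10))  # 16
```

Higher values indicate more self-described dependency; a client who scores high on Autonomy relative to the other two scales can receive a negative total.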

Overt dependency. A Situational Test of Depend- 
ency developed earlier by one of the authors 
(Heller, 1959) from a modification by Borgatta 
(1951) of the Rosenzweig P-F study (1947) was 
used in the current investigation. Borgatta developed 
a role-playing form of the P-F study in which the 
original paper and pencil situations were acted out 
by both examiner and examinee. Borgatta’s evidence 
suggests that subjects react to the role-playing form 
of this test in a manner quite similar to the way 
in which they react to real, overt, threatening situations.
The role-playing situations were further modified
so that all the situations involved a degree of
threat to the respondent. An additional modification
was development of a forced-choice rather than an
open-ended method of responding.

Therapist expectation. Therapist expectation of
client personality change was generally defined as
the feelings held by the therapist relating to the
anticipated nature and intensity of his client’s personality
problems upon completion of the latter’s
psychotherapy. Operationally, this variable was defined
(Goldstein, 1960b) as the difference between
the therapist’s ordering of personality problem Q sorts
when he is instructed to sort them under two different
orientations: (a) according to the status in
which he, the therapist, expects his client’s problems
to be upon completion of psychotherapy; and (b)
according to the manner in which he views his
client’s problems at the time of sorting, i.e., his
present perception of his client.

The Picture Impressions, EPPS, and the Situational
Test of Dependency were individually administered
to all clients immediately prior to their first
therapy session and immediately following their
fifteenth session. If a therapy client dropped out of
therapy before his fifteenth session, he was “posttested”
at the time of dropout. When this occurred,
a control client who had been in the wait group for
the same period of time as the experimental client
had been in therapy, was also tested. The therapists
completed their sortings after every 5 sessions for
15 sessions.


RESULTS AND DISCUSSION


The correlations, for both experimental and
control clients, between pretherapy attraction
and the dependency scores obtained pre- and
posttherapy, as well as the resultant depend-
ency difference score, are presented in Table 1.

The pretherapy correlations in Table 1
indicate a significant relationship between
client’s attraction and both self-descriptive
and overt dependency. Those individuals who
wrote stories to the Picture Impression cards
indicating that they anticipated positive
gratification from therapy, described them-
selves before therapy as more dependent ac-
cording to the EPPS, and also acted more
dependently on the Situational Test of De-
pendency. This finding lends support to the
contention of Dollard and Miller that initial
client dependence can act in ways that main-
tain the early stages of the psychotherapeutic
relationship.

The hypothesized relationship between posi-
tive attraction to therapy and movement
toward self-descriptive independence over
therapy is also supported. Those clients who
are positively attracted see themselves as be-
coming more independent as therapy pro-
gresses. Of additional interest is the fact that
a distinction can be made between self-
descriptive and behavioral changes toward
independence. While the attracted clients saw
themselves as becoming more independent,
this relationship was not found on the overt
behavioral measure. Behaviorally, the at-
tracted clients in the experimental group were
still dependent at the time of the posttherapy
testing, i.e., after 15 sessions of psycho-
therapy. It appears that attracted clients may
come to see themselves as changing, although
their interpersonal interactions remain rela-
tively constant. The motivation for this
change in self-descriptive dependency may
well be the desire to please the therapist and
thus say what they think the therapist expects
or would like them to say. Working with a
sample of psychotherapists who received train-
ing at the same university as the therapists
in the present study, Peterson, Snyder,
Guthrie, and Ray (1958) have demonstrated
TABLE 1

CORRELATIONS BETWEEN CLIENT PRETHERAPY
ATTRACTION AND DEPENDENCY

                                          Group
Pretherapy Attraction        All
with:                        Subjects    Experimental    Control

Pretherapy
  EPPS dependency            .501***     —               —
  situational dependency     91          —               —
Pre-post difference
  EPPS dependency            —           -.774***        -.508*
  situational dependency     —           -.189           .002
Posttherapy
  EPPS dependency            —           .065            -.009
  situational dependency     —           .542**          .241

a Data obtained at this time preceded the assignment of
subjects to experimental and control groups, hence prether-
apy r's were calculated across all 30 subjects.
* Significant at .06 level.
** Significant at .05 level.
*** Significant at .01 level.


that therapists of this training background 
show a great deal of attention to their clients 
when the latters’ remarks demonstrate a pref- 
erence to let others provide for the satisfac- 
tion of their needs. Should this differential 
between self-descriptive and overt behavioral 
test findings be corroborated in further re- 
search, serious doubt would be cast upon the 
common procedure of evaluating the effective- 
ness of psychotherapy by the exclusive use of 
self-descriptive measures. 

Somewhat more puzzling is that the control 
clients, who received no formal psycho- 
therapy, showed almost the same relation be- 
tween attraction and self-descriptive move- 
ment toward independence. It should be noted 
that the control group as a whole showed no 
movement over therapy, either in the inde- 
pendent or the dependent direction. But still, 
when individual variation is considered, those 
in the control group who described themselves 
as becoming more independent over time were 
positively attracted toward therapy, while 
those who described themselves as becoming 
dependent tended to be those who were nega- 
tively attracted. The investigators can only 
speculate concerning the reasons for this rela- 
tionship in the control group. Our present 
inclination is to view attracted clients as indi- 
viduals who would interpret even minimal 
clinic contact (such as is involved in testing 
sessions) as benefiting them in some way. 
TABLE 2

CORRELATIONS BETWEEN CLIENT ATTRACTION AND
THERAPIST EXPECTANCY

Variables correlated              r

Preattraction and:
  TE5 (a)                        .427
  TE10 (b)                       .144
  TE15 (c)                       .199
Difference-attraction and:
  TE5                            .619*
  TE10                          -.137
  TE15                           .418
Postattraction and:
  TE5                            .535*
  TE10                          -.162
  TE15                           .096

a Fifth session therapist expectancy.
b Tenth session therapist expectancy.
c Fifteenth session therapist expectancy.
* Significant at .05 level.

A study by Barron and Leary (1955) ap- 
pears to offer support for this contention. 
They state, with regard to wait-list control 
clients: 
simply having committed oneself to participating 
in psychotherapy, and having had a reciprocal com- 
mitment from a clinic to afford psychotherapy, even 
though not immediately, represents a breaking of 
the neurotic circle. A force for change has already
been introduced. In addition, the initial interview
and the psychological testing may themselves be 
psychotherapeutic events, since during such sessions 
the patient makes some efforts to confront himself 
and his problems more objectively than he has in 
the past (p. 244). 


A recent investigation by Goldstein (1960a) 
supports this finding. 

Table 2 presents the correlations between 
therapist expectancy of client improvement, 
as obtained at five session intervals, and client 
pre- and posttherapy attraction, as well as the 
change in attraction over the course of 
therapy.

These findings indicate a significant rela- 
tion between the expectation of client im- 
provement held by the therapist early in 
therapy, and both the change in client at- 
traction over the course of therapy and the 
magnitude of client attraction subsequent to 
the fifteenth session. None of the other cor- 
relations presented in Table 2 reached ac- 
cepted levels of significance. 


Kenneth Heller and Arnold P. Goldstein 


In addition to offering partial support for 
the hypothesis that therapist expectancy is a 
relationship maintaining aspect of the psycho- 
therapeutic interaction, these findings raise 
the question as to why this should only be the 
case with regard to the therapist’s early 
expectations (fifth session), and not his tenth 
and fifteenth session anticipations of client 
improvement. A study by Good (1952) fur-
nishes a basis for differentiating “early” and
“late” therapist expectations, a differentiation
which appears to shed light on the present
study’s findings. He states:


Support was found for the hypothesis that in a 
relatively novel situation, generalization effects 
chiefly determine the expectancy held by S, and that 
as S has more experience with the specific task, 
expectancies develop which are a function of this 
task (p. 99). 


In the present study, the expectations held 
by the therapist at the fifth session regarding 
client personality change may be more a 
function of his perceived success and failure 
with past clients than his feelings concerning 
his present client’s progress. By the tenth 
session, however, their psychotherapeutic in- 
teraction is less “novel” and the major 
determiner of the therapists’ expectations may 
have shifted from generalization effects to task 
effects. Kelly (1955), Lennard and Bernstein 
(1960), and Rotter (1954) have also noted 
significant temporal shifts in therapist ex- 
pectancies over the course of psychotherapy. 

The basis for the failure of late-therapy
therapist expectations to be relationship 
maintaining would appear to be an important 
question for further research. 


SUMMARY


The present investigation attempted to
determine the extent to which client depend-
ency and therapist expectation of client im-
provement can be considered relationship
maintaining variables in the psychothera-
peutic interaction. Thirty clients undergoing
psychotherapy at a University Psychological
Clinic were randomly assigned to “therapy”
and “no-therapy” conditions and the 15
“therapy” clients were randomly assigned to
10 therapists. The therapists, for the most
part, were advanced graduate students in clin-
ical psychology at the Pennsylvania State
University. Testing, on measures developed or
modified for the current study, took place pre-
and posttherapy for all clients and after every
five sessions for the therapists. Results of the
study indicated a strong positive relation be-
tween client pretherapy attraction to the
therapist and: (a) both client self-descriptive
and behavioral dependency before therapy
and, (b) client self-descriptive, but not be-
havioral, movement toward independence over
the course of therapy. A similar but less
marked relationship occurred in control group
clients, offering further evidence for the
therapeutic nature of such nonspecific clinic
contacts as the intake interview and psycho-
logical testing.

An unexpected finding was the relatively
high degree of relation between pretherapy
attraction in therapy clients and their overt
posttherapy dependency, a finding at variance
with that obtained on self-descriptive instru-
ments. In addition to other implications, this
finding suggests caution in interpreting re-
sults in psychotherapy research which are
based solely on one level of measurement.

Finally, partial support was obtained for
the hypothesis that favorable therapist ex-
pectation of client improvement can function
to maintain the therapeutic relationship.
In a previous paper (Cartwright, 1957),
it was proposed that one of the common 
themes of psychotherapy with psychoneurotic 
patients is a search for a stable identity. In 
that paper the author put the problem in 
these terms: 


If selves or roles are thought of as characteristics 
which are particular to specific interactions or 
classes of interaction, then the area of self is that 
core of characteristics which is common over N 
situations. In these terms the pre-therapy client can 
be thought of as one whose self is very small. He 
seems to have diverse selves in relation to others 
with whom he interacts, but the common core, the 
essential me is so restricted in scope as to leave him 
puzzled by the question “Who am I?” 


It was hypothesized that the self the pre- 
therapy client describes would differ con- 
siderably depending on his interactional 
referent, while the posttherapy client’s self- 
description would have more stability in 
terms of a larger core of characteristics which 
remain consistent in emphasis, regardless of 
the particular other involved. Of course it is 
most likely that the healthy individual will 
retain some variability, that there is an 
optimal point here short of rigidity. 

Specifically, the 1957 study hypothesized 
that, if asked to select three people of major 
importance to them and to describe them- 
selves on a Q sort as they are with each of 
these people in turn, a group of clients would 
show more variability among these sortings 


1 The data for this study were collected under a 
research grant from the Ford Foundation to the 
University of Chicago Counseling Center. 

2 The author wishes to thank John L. Vogel, now 
of the Department of Psychiatry, University of 
Washington, Seattle, who did all of the testing of 


the subjects. 


before therapy began than after therapy had
been completed; successful clients would 
change more than failure clients, and clients 
before therapy would have more variability 
in their self-descriptions than a group of con- 
trol subjects, but would not differ from 
controls after therapy had been concluded. 

Although the 1957 study supported the 
hypothesis outlined above, there were several 
reasons why it was felt that the study should 
be replicated and extended. (a) The sample 
was a very small one: 10 experimental sub-
jects (5 successful and 5 unsuccessful cases)
and 10 controls. (b) The controls were only
tested at one point in time, as it was as-
sumed that they would not change in self-
consistency. This should be experimentally
established. (c) The relations among these 
various self-to-other sortings were not tested. 
For these reasons the present study was 
undertaken. 


METHOD 
Sample 


The present sample consists of 19 experimental
subjects³ and 20 controls. Originally 30 subjects who
applied to the University of Chicago Counseling
Center for psychotherapy were asked to participate
in this research study. Of these, eight discontinued
therapy before completing the minimum number of
interviews, which was set at six. An additional three
cases of those completing therapy were lost through
failure to make contact with them for posttherapy
testing.

The control sample was selected to match the
experimental sample in age, sex distribution, and
student and nonstudent status. As they were to be
controls for the variable therapy, they were selected
as never having had, not now having, and having
no immediate intention of having psychotherapy.
Table 1 compares the samples in the two studies.

3 This sample is identical to that reported in
the paper by Cartwright and Vogel (1960).

Psychotherapy and Self-Consistency


Procedure 


The procedure differed slightly from that used
in the previous study in that the subjects were first
asked to complete a “plain” self-sort. The 100 item
Butler and Haigh Q cards (1954) were again used
as the primary instrument. Each subject was first
instructed to sort the cards to describe himself as
he is today from those items which are most like
him to those which are least like him according to
the required 1, 4, 11, 21, 26, 21, 11, 4, 1 distribu-
tion. Next a TAT was administered. Following this
the procedure was identical to that previously re-
ported. The instructions to the subjects were:

I would like you to think of three people who are
very important to you. They can be of any age
or relation to you like father, mother, child,
friend, or boss. They can be people you like or
dislike, or some of each. You don’t have to pick
on any basis except they be people who are very
important to you and with whom you have real
interaction.

The subject was then given a code sheet on which
to write code names for his choices opposite their
relationship to him. He was then given a deck of
Butler and Haigh Q cards and instructed to sort
them to describe himself as he is in his relationship
to the first person on his list.

Sort these cards to describe yourself as you
see yourself to be in your relationship to
______, from those cards which are least like
you as you are with him (her) to those that are
most like you in your relations with him (her).
Keep a question in mind while you are work-
ing: If I were only the person I am with
______, what would I be like?

Following this sorting, the subject was given
another set of the same cards and asked to sort
again, to describe himself as he is with the second
person on his list. Then this was repeated a third
time for the third person of major importance to
him. In all then, there were four sortings of the


TABLE 1

COMPARISON OF THE EXPERIMENTAL AND CONTROL
SUBJECTS IN THE 1957 AND 1959 STUDY

                 Sex          Age           Number of interviews
         N     M    F     Mean   Range      Mean   Range
210s 
7 ca 
Epis; 1 7 3 25 agar 26 «FO 
eso 12 7 3 26 19-44 pes 
coy o 10 275) tee gees 
0 10 10 293 19-53 
TABLE 2

COMPARISON OF THE MEAN ITEM VARIANCES FOR THE
1957 AND 1959 STUDIES

Sample      Pretherapy    Posttherapy
E 1957      .969          .766**
E 1959      .939          .763*
C 1957      .736          Not tested
C 1959      .763          .566****
S 1957      .849          .546**
S 1959      .940          .580***
F 1957      1.088         .936
F 1959      .938          .928

**** Significant at .001 level.


same Q items using the same distribution: one self- 
sort and three of his conception of himself in three 
different relationships. 

For the experimental subjects these procedures 
were repeated when they had completed their 
therapy. For 10 of the control subjects they were 
repeated after a period of 6 months (to match the 
mean of 18 weeks of therapy). 
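
The forced distribution named in the procedure (1, 4, 11, 21, 26, 21, 11, 4, 1 cards over nine piles, 100 cards in all) lends itself to a simple check. The following is a sketch only: the function name and the pile numbering 0 through 8 (least like me to most like me) are assumptions made for illustration.

```python
from collections import Counter

# Required number of cards in each pile, pile 0 (least like me)
# through pile 8 (most like me). The counts come from the text;
# the 0..8 numbering is an assumption.
REQUIRED = [1, 4, 11, 21, 26, 21, 11, 4, 1]


def is_valid_sort(placements):
    """Return True if a sorting follows the forced distribution.

    placements: list of pile indices, one entry per Q item.
    """
    if len(placements) != sum(REQUIRED):
        return False
    counts = Counter(placements)
    return all(counts.get(pile, 0) == n for pile, n in enumerate(REQUIRED))


# Build one sorting that satisfies the distribution by construction.
example_sort = [pile for pile, n in enumerate(REQUIRED) for _ in range(n)]
print(sum(REQUIRED), is_valid_sort(example_sort))
```

Because every subject must use the same pile sizes, all sortings share the same mean and spread, so differences between sortings reflect only which items were placed where.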


ANALYSIS AND RESULTS 


To test the major hypothesis of increased 
self-consistency following psychotherapy, the 
same method reported previously was em- 
ployed. Briefly the mean item variance over 
the three self-in-relationship sortings was 
computed by a method developed by Cart- 
wright (1956a). The hypothesized differences 
between these means were then tested using t.
The t between pre- and posttherapy mean
for the experimental (E) group was 2.100,
significant at the .05 level.⁴

The t between the first and second testing
of the control (C) group was 4.394, signifi- 
cant at the .001 level. 
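
The details of the Cartwright (1956a) method are not given here, so the following is only a rough sketch of what a mean item variance could look like. It assumes that each item's variance is the sample variance of its pile position across the three self-in-relationship sortings, and that these are averaged over items. Both the function name and that choice of variance are assumptions.

```python
from statistics import mean, variance


def mean_item_variance(sorts):
    """Average, over items, of the variance of each item's pile position.

    sorts: list of sortings (here three), each a list of pile positions
    given in the same item order. Sample variance (n - 1 denominator)
    is an assumption; the original method may differ.
    """
    per_item = [variance(positions) for positions in zip(*sorts)]
    return mean(per_item)


# Two hypothetical items sorted three times: the first item moves
# between piles 0, 2, and 4; the second stays in pile 4.
toy_sorts = [[0, 4], [2, 4], [4, 4]]
print(mean_item_variance(toy_sorts))
```

A lower value means the subject places the same items in similar positions whichever relationship he has in mind, which is the sense of self-consistency used in the text.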

From these results it would appear that 
a second testing of either experimental 
(therapy) or control (no therapy) subjects 
will show a significant increase in consistency 
of sorting. 

In the 1957 study, five successful therapy 
cases were found to decrease their mean 
item variance significantly while five failure 
cases did not. Using the same criterion of 
successful therapy for the present sample 


4 All significance levels are for two-tailed tests 
unless otherwise stated. 


TABLE 3

MEAN Q ADJUSTMENT SCORES OF SELF-SORT OF
CONTROL, SUCCESS, AND FAILURE GROUPS

                                     Direction of change in
            Q Adjustment score       adjustment and self-consistency
Group       Test 1     Test 2        Same      Different
Control     49.3       50.5          5         5
Success     36.7       49.4****      9         0
Failure     36.1       38.3          6         4

**** Significant at .001 level.


yielded only 3 cases of success out of the 19. 
This criterion (based on therapists’ ratings 
of adjustment, change in adjustment, and 
success of the therapy) was, therefore, re- 
luctantly abandoned for one that would split 
the group more equally. The new criterion 
was based on change in test performance 
from pre- and posttherapy time. The 19 ex- 
perimental subjects were ranked on the ex- 
tent of their improvement on two test scores: 
the Q Adjustment score (Dymond, 1954a) 
and a mental health rating of the TAT 
(Dymond, 1954b). The 9 subjects who im- 
proved most on this combined ranking of 
change were called the success (S) group and 
the 10 who ranked lowest were called failures 
(F). Table 2 compares the mean item vari- 
ances for the groups and subgroups involved 
in the original and in this replication study. 
It is clear from Table 2 that failures as a 
group do not increase in self-consistency on 
a second testing. 
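
The success and failure split described above can be sketched as a combined ranking. This is an illustration under stated assumptions: the text does not say how the two rankings were combined or how ties were handled, so summing the two ranks and the function name used here are both hypothetical.

```python
def split_by_combined_rank(q_change, tat_change, n_success):
    """Split subjects into success and failure groups.

    q_change, tat_change: dicts mapping subject id to improvement on
    the Q Adjustment score and the TAT mental health rating.
    Ranks are summed (an assumption); rank 1 is the largest improvement.
    """
    def ranks(change):
        ordered = sorted(change, key=change.get, reverse=True)
        return {sid: rank for rank, sid in enumerate(ordered, start=1)}

    rq, rt = ranks(q_change), ranks(tat_change)
    combined = sorted(q_change, key=lambda sid: rq[sid] + rt[sid])
    return combined[:n_success], combined[n_success:]


# Hypothetical change scores for four subjects (illustrative only).
q = {"a": 10, "b": 5, "c": 1, "d": -2}
tat = {"a": 2, "b": 3, "c": 0, "d": -1}
success, failure = split_by_combined_rank(q, tat, n_success=2)
print(success, failure)
```

In the study the same idea would be applied to the 19 experimental subjects with the 9 highest-ranked called successes. Note that a criterion built from test change is not independent of later test-based comparisons between the two groups.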

While the extent of the replication of the 
mean scores in the two studies is striking, 
particularly since such small groups were in- 
volved, the meaning of the change for the 
therapy subjects is obscured by the highly 
significant change in the group which was not 
in treatment. Testing for the significance of 
the difference between the changes in the S 
group and the C group yielded a nonsignifi-
cant t (1.644). However, inspection of the
array of change scores showed that the range 
of changes was considerably greater for the 
S group than for the Cs. An F ratio was 
computed and found to be 3.30, significant 
at the .05 level. A comparison of the disper-
sion of the changes between the total experi-
mental group and the C group showed the
E group's dispersion to be much wider, of 
course. The F ratio here was 13.19, signifi- 
cant beyond the .001 level. These findings 
of the significant difference in the variances 
of changes between E and C and between S 
subgroup and C are similar to the finding
reported by Cartwright (1956b) on his re- 
analysis of the data reported by Barron and 
Leary (1955). He concluded that clearly 
there was more change with regular formally 
defined therapy than without it, but in both 
directions. Similarly, in this study it can be 
argued that although both E and C groups 
changed significantly in the direction of in- 
creased self-consistency over similar time 
periods, those in therapy showed wider 
changes. It seems that therapy has more 
impact to change persons on this measure 
than the unknown influences which brought 
about the change in the no-therapy group. 
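
The comparison of dispersions reported above is a ratio of two variances of change scores. A minimal sketch follows, assuming the larger sample variance is placed in the numerator. The function name and the example change scores are hypothetical, and no significance lookup is attempted.

```python
from statistics import variance


def variance_ratio(changes_a, changes_b):
    """F ratio of two sets of change scores, larger variance on top.

    Uses sample variances. Raises ZeroDivisionError if the smaller
    variance is zero.
    """
    var_a, var_b = variance(changes_a), variance(changes_b)
    return max(var_a, var_b) / min(var_a, var_b)


# Hypothetical change scores: a therapy group that moves widely in
# both directions and a control group that moves little.
therapy_changes = [-2, 0, 2]
control_changes = [-1, 0, 1]
print(variance_ratio(therapy_changes, control_changes))
```

Two groups can show the same mean change and still differ sharply on this ratio, which is the point made in the text about therapy producing wider changes.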

Further light is shed on the meaning of
these changes by looking at the Q Adjust-
ment scores of the plain self-sort. If the
change toward greater consistency is to be
evaluated positively, the more stable self
should be one the subject can live with more
happily. Therefore it should be accompanied
by a higher Adjustment score of the self-
description. This is not a necessary relation-
ship. It is possible for the self-in-relationship
to other sortings to become less varied with-
out an accompanying change towards better
adjustment. This is what appears to happen
in the control cases.

Several important points emerge from
inspection of Table 3.

1. The C group does not improve in ad-
justment despite the fact that it does in-
crease in self-consistency. The improvement
in the Adjustment score of the E group as
a whole is significant beyond the .01 level;
in the S subgroup at the .001 level.

2. In only 5 of the 10 C cases do the
measures move in the same direction, while
the ratio is 13 of 19 for the E group, and
9 out of 9 for the S subgroup. It seems true
then that the increase in self-consistency in
the C cases must be interpreted differently,
perhaps due to less motivation for making
fine distinctions on the second test. The in-
structions certainly set the subjects to look
for differences in their ways of viewing them- 
selves in their interactions. If the subject is 
largely consistent in his self-picture over 
various interactional settings, the instructions
might cue him to make small distinctions
which are relatively unimportant to him, and
so not maintained on Test 2. If this explana-
tion is tenable, it would be expected that the
C group would differ from the S group in
various ways. (a) They would have fewer
items with large discrepancies on Test 1 than
S subjects, and more small discrepancy items.
(b) They would show less proportional change
in their large discrepancy items than the S
group. (c) The increased consistency of the
C subjects would not represent any major
change in self-definition. Items which are
consistent on Test 2 which were not previ-
ously consistent will have changed less in
their position of importance to the C subjects
than to the S cases.

To test these notions, the extent of the
discrepancy over the three sortings for each
item was tabulated. Items with a discrepancy
of 0-2 Q points were called Small Difference
items. Items with discrepancies of 3 or more
points were called Big Difference items. As
these predictions were all directional in na-
ture, one-tailed tests were used. All were
confirmed at the .05 level or better.
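
The tabulation of Small Difference and Big Difference items can be sketched as follows. The text does not define "discrepancy" precisely, so this sketch assumes it is the range (highest minus lowest pile position) of an item across the three sortings. The function name is hypothetical.

```python
def classify_items(sorts, big_threshold=3):
    """Label each item "small" or "big" by its discrepancy across sortings.

    sorts: list of sortings, each a list of pile positions in the same
    item order. Discrepancy is taken as max minus min (an assumption);
    0-2 Q points is "small", 3 or more is "big", as in the text.
    """
    labels = []
    for positions in zip(*sorts):
        discrepancy = max(positions) - min(positions)
        labels.append("big" if discrepancy >= big_threshold else "small")
    return labels


# Three hypothetical items over three sortings.
toy_sorts = [[0, 4, 8], [2, 4, 5], [1, 4, 8]]
print(classify_items(toy_sorts))
```

Counting the two labels per subject at Test 1 and again at Test 2 gives the quantities needed for predictions (a) and (b).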
Looking at the new consistent items for
the two groups, it was found that the S group
drew their new self items from a different
source than did the C group. If items sorted
at extreme scale positions (0,1,2 = least
like me; and 6,7,8 = most like me) are as-
sumed to be more important to the subject
than those sorted at the middle of the
forced distribution, an item which is now
extreme in placement and consistent may
have been previously extreme but in-
consistent or previously central. Table 4
shows the previous position of the new ex-
treme consistent items for the S and C
groups.

Without doubt the new stable elements of
self which are of importance to the
therapy people are significantly more often
items which have changed with respect to
their importance. It is more often the case
TABLE 4

PREVIOUS POSITION OF NEW EXTREME
CONSISTENT ITEMS

      Success group            Control group
    Central    Extreme       Central    Extreme
      11          0             3          9
       9          3             2          6
      14          5             2          7
      12          3             5          4
       8          2             2          6
       4          2             0          2
       8          6             2          6
       8          1             3         10
       7          2             5          5

Mean  9.0        2.6           2.7        5.9

Note.—Succe:
Contr

*** Significant at .01 level or beyond.


that these items have gained an importance 
over the therapy period, rather than that 
they were important previously and have only 
gained in stability or consistency. For the 
controls, the reverse is true. The new stable 
items at the extremes of the sorts on Test 2 
were significantly more often also at the ex- 
tremes on Test 1, although at that time in- 
consistently sorted. Here, then, is another 
indication that the increased consistency of 
these two groups represents different proc- 
esses. The successful therapy cases have 
changed their definitions of themselves in 
their important relationships, while the con- 
trols seem only to have consolidated the 
definitions previously made. 

If this proposed explanation holds, that 
the increased consistency of the two groups 
is essentially different, that therapy has an 
impact which shifts the meaning of the self 
as well as exercising a consolidating influ- 
ence, then the particular selves that the S 
group describes should reflect the shift in 
meaning in their Adjustment scores in their 
relation to others. If the C group’s con- 
sistency is due only to a lack of motivation 
for making fine distinctions, then no essential 
change of the adjustment of self in relation 
to others would be expected. Since each 
subject was free to choose the relationships 
TABLE 5

MEAN ADJUSTMENT SCORES FOR SELF-IN-RELATIONSHIP TO MOTHER, FATHER, AND
OTHERS OF SUCCESS, FAILURE, AND CONTROL GROUPS

              Mother              Father              Other
Group      Test 1   Test 2     Test 1   Test 2     Test 1   Test 2
Success     38.3     49.8       46.0     53.2       44.4     52.1
Failure     32.2     37.2       39.4     36.2       43.3     34.8
Control     52.7     54.8       42.3     45.5       46.9     47.9

Note.—Success: Mother Test 1-Mother Test 2
               Others Test 1-Others Test 2
      Failure: Others Test 1-Others Test 2
               Mother Test 1-Others Test 1
      Control: Mother Test 1-Father Test 1
               Mother Test 2-Father Test 2
      Mother:  Success Test 1-Control Test 1
               Failure Test 1-Control Test 1
               Success Test 2-Failure Test 2
               Failure Test 2-Control Test 2
      Father:  Success Test 2-Failure Test 2
      Others:  Failure Test 2-Control Test 2
               Success Test 2-Failure Test 2

* Significant at .05 level.
** Significant at .02 level.
*** Significant at .01 level.

of importance to him, it is hard to make 
strict comparisons. For this reason relation- 
ships were only categorized as father, mother, 
and others. Table 5 gives the mean Adjust- 
ment scores for these categories at the two 
testing points. 

Some of the significant relations were: 
(a) S cases improve in the adjustment of 
the relationship to their mothers between 
Test 1 and Test 2. Neither F cases nor C 
cases show any significant change. (b) Both 
E groups are more poorly adjusted in their 
relationship to their mothers on Test 1 than 
C cases. S cases are not significantly different 
after therapy from the adjustment level of 
the C group on their second test. S cases 
border on being significantly higher in ad- 
justment on their second test than F cases 
who are still significantly poorer in mean 
adjustment than Cs. (c) There is no sig- 
nificant difference between the adjustment 
level of self-in-relationship to mother and 
father for either E group at either time. 
However, the self-in-relationship to father of 
the C group is significantly lower than their 
self-in-relationship to mother on Test 1, and 
this borders on being a significant relation- 
ship on Test 2 as well. (d) None of the 
groups changes significantly in the adjust- 
ment of the self-in-relationship to father. The 
F group is lowest of the three in mean ad- 


justment on Test 1, and this drops on Test 2 
to be significantly lower than the S group.
(e) S cases improve their mean adjustment
score in relation to others significantly over
the therapy time, whereas F cases change
significantly towards poorer adjustment. On
Test 2 Fs are significantly lower in adjust-
ment to others than both Ss and Cs.

It would appear from these results that 
patients, as distinct from controls, come to 
therapy with disturbance in their self-to- 
mother relationship. Patients who subse- 
quently do not succeed in therapy start out 
with poor adjustment of their self-image in
relation to both parents while those who
succeed start with strength from their self-to-
father image.

Since it seemed likely that some interesting
conclusions could be drawn from these results
concerning the kinds of problem relationships
that persons bring to therapy and those which
do and do not get resolved with therapy of
this kind, it seemed important to check the
data as far as possible with those from the
previous study. This is difficult since a dif-
ferent criterion was employed and the total
number of cases was only 10. For this
reason, the comparison was made for the
total E group.

The first study shows some of the same
patterns as found in the present sample.
TABLE 6

COMPARISON OF MEAN ADJUSTMENT SCORES FOR 1957
AND 1959 STUDIES

Mother Father Other
Sample Test 1 Test 2 Test 1 Test 2 Test 1 Test 2
E 1957 34 42.8 30.8 41.0 37.0 43.7
E 1959 35.0 43.0 43.4 46.0 43.8 80
C 1957 36.6 38.0 47.2
C 1959 52.7 54.8 42.3 45.5 46.6 47.2


(a) The mean adjustment of the C group’s 
self-in-relation to father is again lower than 
their mean adjustment of self-in-relation to 
mother, although this does not reach sta- 
tistical significance. (b) Patients in the 1957 
study were also lower in mean adjustment 
of self-in-relation to mother than were con- 
trols. Speaking loosely then, both studies 
show that clients come to therapy with 
mother problems which are not present in 
Control subjects. Control subjects may have 
father problems, but these do not seem to 
be sufficiently disturbing to bring them to 
therapy. The good adjustment of the self-in- 
relation to the mother seems to be the dis- 
tinguishing mark between patients and those 
who are not patients. However, it was
thought that perhaps the sex of the subjects
might be an important variable in these
relationships. Table 7 gives the mean adjust-
ment scores for the two sexes in the current
study.

It is clear from Table 7 that
clients of both sexes have problems of ad-
justment of the self-in-relation to their
mothers, and this is not true for either sex 
group of control subjects. Also, it is obvious 
that of the clients, it is the males that have 
father problems that bring them to therapy.
Further, therapy of this kind improves the 
adjustment of the self-in-relationship to the 
mother for both sexes, but not significantly 
for males, and there is little change in the 
father relationship for either sex. 


SUMMARY AND DISCUSSION


This paper has reported a study designed 
to replicate and extend a previous study 
of the effects of psychotherapy on self- 
consistency. In both experiments the samples 
consisted of psychoneurotic subjects who had 
applied to the University of Chicago Counsel- 
ing Center for client centered psychotherapy. 
Both studies employed matched control sub- 
jects who were not motivated for therapy. 
The original study found the therapy group 
to have lower self-consistency than controls 
before therapy, and to have increased their 
consistency at the completion of therapy to 
the level of the control group. 

The replication study confirmed the find- 
ings of the first, but found that controls also 
increase their self-consistency on Test 2 sig- 
nificantly over their Test 1 level. An analy- 
sis of the kinds of change involved showed 
different processes at work. The increased 
consistency of the controls did not represent 
much redefinition of the self-in-relationship 


TABLE 7

COMPARISON OF MEAN ADJUSTMENT SCORES OF MALES AND FEMALES

              Mother           Father           Other
Sample    Test 1  Test 2   Test 1  Test 2   Test 1  Test 2
E 195 
2 c 
38.7 41.4 42.5 41.2 
Males (av = 9) 31.8 oe 50.0 52.8 45.9 45.1 
emales (N = 10) 37.8 45.8 
c 1959a as 
44.9 50. 
Males (V = 19) 50.1 FAN 146 
males (y = 10) 51.3 


= 1.473 
Note. Males: Mother Test irea ie Het: ae 
: st 1- o 
sango cases lea; Moller aa Test 1 are reported here 
"8nificant at .01 level. 
to the important others but rather a consolidation
of the definitions previously made.
In addition, there was no accompanying 
change in the adjustment level of the self 
for the control group. In the successful 
therapy group the new self-consistent items 
showed significantly more shift in importance 
in the various relationships so that it could 
be stated that redefinition was taking place. 
Moreover, the overall adjustment level of 
the self-sort improved significantly. 

Looking at the particular interactional 
referents, it was found that clients were more 
poorly adjusted in their self-to-mother image 
than controls, and this relationship held for 
both studies and for the experimental group 
when stratified both by outcome of therapy 
(success and failure), and by sex (males and 
females). This appears to be a reliable dis- 
tinguishing characteristic of subjects who 
apply for psychotherapy (at least of this 
kind), and who stick with it beyond six 
interviews. 

Control cases in the present study were 
significantly poorer in adjustment of their 
self-image in relation to their fathers than to 
their mothers, and this bordered on being a 
significant difference for the original sample 
of controls as well. It seems likely, then, that 
it may be “normal” to have a more poorly 
adjusted image of oneself in relation to father 
than to mother, but providing the self-to- 
mother image is sufficiently well adjusted, 
there is no internal pressure to seek psycho- 
therapy. This finding held equally for both 
sexes of control subjects in the present study. 
Unfortunately, the N in the previous study
was too small to test it for that group.

In the present study, males were found to 
enter therapy with significantly poorer ad- 
justment of self-in-relationship to father than 
did females. Successful cases improved the 
adjustment of their self-sort in relation to 
their mother, but there was no improvement 
of adjustment of the self-sort in relation to 
the father for any group in this study. This 
type of therapy, then, seems to be effective 
in resolving mother problems, but does not 
appear to change the self-to-father image. 


Rosalind Dymond Cartwright 


Perhaps clients with mother problems self- 
select client centered therapy as meeting their 
needs of working out their problems of ad- 
justment to a maternal figure. The therapist 
in this type of therapy is supposed to be 
warm, accepting, empathic, understanding, 
nondirective and to stress the feeling aspects 
of the communications: to play, essentially, 
a very feminine role. Or perhaps most 
psychoneurotic outpatients would be found 
to have disturbance in the mother area and 
some, namely, males, in the father area as 
well. But those who are in a more directive 
type of therapy might be found to have a 
reversed pattern of changes: improvement in 
the father area and little change in the 
mother area. These questions remain open for 
further research evidence. 
SEXUAL SYMBOLIC RESPONSE IN PREPUBESCENT
AND PUBESCENT CHILDREN


AUSTIN JONES 1
University of Pittsburgh 


The Freudian hypothesis that pointed,
elongated objects are symbolic of the penis
and that rounded or enclosing objects are
symbolic of the vagina has received inferential
experimental support in a recent study (Jones,
1956). Adult subjects were asked, essentially,
to ascribe masculine or feminine personality
to each of a series of simple geometric figures.
The figures were of two classes: pointed or
elongated, and rounded or enclosing. Subjects
responded in a manner consistent with the
Freudian hypothesis significantly more often
(p < .001) than would be expected by chance
alone. Male subjects were found to respond
significantly more consistently with the
hypothesis than females. In discussing the latter
finding, it was noted that available evidence
(Kinsey, Pomeroy, Martin, & Gebhard, 1953)
suggests that the sexual response of men is
more readily conditioned to a wide variety of
visual stimuli than is that of women. The
symbols employed in the study were modifications
of those used earlier by Levy (1954),
who failed to find support for the Freudian
hypothesis in an experiment involving the
matching of symbols and male and female
given names, and the paired associate learning
of like-sex and unlike-sex pairs of symbols and
names. The subjects of that study were fifth
grade children, most of whom were presumably
prepubescent. In light of the subsequent
positive findings with adult subjects, it appeared
possible that age differences
might account partly for
the negative results of the earlier study.
Developmental studies of sexual symbolism
of an experimental nature are apparently
totally lacking.
1 The author wishes to thank William —,
— oodman, and Melvin Manis for helpful
comments and advice.


The purpose of the present study was to 
assess the strength of the sexual symbolic 
response in children of various ages up to 
the time, roughly, of pubescence. The attitude 
of the study was exploratory; it is possible to 
formulate rather contradictory hypotheses as 
to the function relating age and strength of 
response. One view would hold that, symbolic 
behavior being generally a matter of socially 
acquired learning, the strength of the sexual 
symbolic response ought to increase with in- 
creasing age and increasing exposure to the 
socially determined symbols. Also, symbolic 
behavior generally may be conceived of as 
evidence of socially imposed inhibitions of 
more direct expression; the young child, 
having lived fewer years under the inhibiting 
influence of civilization, would be expected to 
show less use of symbolism and relatively 
greater directness of sexual expression than 
would the older child or adult. A different 
hypothesis emerges when we consider the data 
regarding drive level and generalization phe- 
nomena. As drive increases, the height of 
the generalization gradient is raised and dis- 
crimination correspondingly reduced. Beach 
(1942), for example, increased the sex drive 
of a group of male rats by injections of 
testosterone, with the result that the animals 
made sexual responses to a markedly increased 
range of stimuli—thus failing to discriminate 
what function “normally” as sexual and non- 
sexual stimuli. Similarly, we would expect 
that children approaching pubescence, i.e., 
experiencing an increase in sex drive, would 
demonstrate a decrement in discrimination 
between male and female sexual cues. Their 
responses to symbols would be less consistent 
with the Freudian formulation than those of 
children somewhat younger. The following 
experiment was designed to clarify the issues
involved in these contrasting hypotheses. 


METHOD 
Subjects 


Since the intent of the experiment was to study the 
sexual symbolic response as a function of degree of 
sexual maturation, a strictly chronological arrange- 
ment of subject groups which includes both sexes is 
less meaningful than a categorization which takes 
into account the different maturation schedules of 
males and females. Due to the onset of puberty ap- 
proximately 2 years earlier in females than in males, 
roughly equivalent levels of sexual maturation are 
obtained by grouping the males of a given age with 
females 2 years younger. Four such “sexual matura- 
tion” groups were constituted using boys and girls 
in a public school system. The “youngest” sexual 
maturation group consisted of second grade girls and 
fourth grade boys; the next two groups, fourth grade 
girls plus sixth grade boys, and sixth grade girls plus 
eighth grade boys. The final group was planned to be 
eighth grade girls and tenth grade boys; but adminis- 
trative permission could not be obtained for the 
utilization of high school (and hence tenth grade) 
students. Ninth grade boys were available, however, 
so that the final sexual maturation group was ap- 
proximated by pairing eighth grade girls and ninth 
grade boys. Ten boys and 10 girls were selected 
randomly at each of the four levels, making a total 
of 80 subjects. The mean ages of the five grades 
from which subjects were drawn (grades two, four, 
six, eight, nine) were 8.5, 10.5, 12.3, 14.4, and NRE 
respectively. Thus the subjects of the first sexual 
maturation level are approximately four years pre- 
pubescent, while at the fourth level most subjects 
of both sexes have attained puberty. 
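
The grouping rule described above can be sketched in a few lines. This is only an illustration: the function and field names are hypothetical, while the girls' grades, the planned 2-year offset, the ninth grade ceiling for boys, and the 10 boys plus 10 girls per level are taken from the text.

```python
# Illustrative sketch of the "sexual maturation" grouping described in the text.
# Names are hypothetical; grades, the 2-year offset, and group sizes come from
# the Subjects section.

PLANNED_OFFSET = 2      # boys are paired with girls two grades younger
HIGHEST_BOY_GRADE = 9   # tenth grade boys were not available


def maturation_groups(girl_grades, offset=PLANNED_OFFSET, cap=HIGHEST_BOY_GRADE):
    """Pair each girls' grade with the boys' grade at a comparable maturation level."""
    groups = []
    for level, girl_grade in enumerate(girl_grades, start=1):
        boy_grade = min(girl_grade + offset, cap)
        groups.append(
            {"level": level, "girls_grade": girl_grade, "boys_grade": boy_grade}
        )
    return groups


groups = maturation_groups([2, 4, 6, 8])
subjects_per_level = 10 + 10  # 10 boys and 10 girls selected at each level
total_subjects = subjects_per_level * len(groups)

for g in groups:
    print(g)
print("total subjects:", total_subjects)
```

Only the fourth level departs from the planned offset, because ninth grade boys stand in for the unavailable tenth grade boys.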


Materials 


The stimuli in this experiment were the 10 sexual 
symbolic figures adapted by Jones (1956) from those 
used initially by Levy (1954). Five are pointed or 
elongated (“male”) and five are rounded or en- 
closing (“female”). The figures are essentially 
“abstract” or without specific object identity; they are 
variations of circles, pyramids, rods, cubes, etc. The 
figures are reproduced in black ink on individual 
white cards. In the prior study supporting the 
Freudian view with adult subjects, one of the pre- 
sumed female symbols was shown not to be so. The 
figure was included in the present administration in 
order to make the procedures of the two studies 
comparable for clarity of discussion, although re- 
sponses to the nonfunctional figure did not enter into 
the score variable. 
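
A minimal sketch of how such a score could be computed for one subject follows. The figure identifiers and the response coding ("men" or "women") are assumptions made for illustration; the only details taken from the study are that five male figures and four female figures are scored and that the nonfunctional female figure is presented but left out of the score.

```python
# Hypothetical scoring sketch: percentage of responses consistent with the
# Freudian hypothesis, computed separately for male and female symbols.
# Figure identifiers and response labels are illustrative assumptions.

MALE_FIGURES = ["m1", "m2", "m3", "m4", "m5"]
FEMALE_FIGURES = ["f1", "f2", "f3", "f4"]   # the four scored female figures
NONFUNCTIONAL_FIGURE = "f5"                 # presented but not scored


def percent_consistent(responses):
    """Return percent of hypothesis-consistent responses for each symbol type."""
    male_hits = sum(responses[fig] == "men" for fig in MALE_FIGURES)
    female_hits = sum(responses[fig] == "women" for fig in FEMALE_FIGURES)
    return {
        "male_symbols": 100.0 * male_hits / len(MALE_FIGURES),
        "female_symbols": 100.0 * female_hits / len(FEMALE_FIGURES),
    }


example = {
    "m1": "men", "m2": "men", "m3": "women", "m4": "men", "m5": "men",
    "f1": "women", "f2": "men", "f3": "women", "f4": "men",
    "f5": "men",  # ignored by the scoring function
}
scores = percent_consistent(example)
print(scores)
```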


Procedure 

The subjects were seen individually by the author. 
The experimental procedure was second in a series 
of other experimental procedures, allowing time for 
the establishment of optimal rapport. Subjects were
presented the following instructions orally. They are 
the instructions employed in the prior study with 
adult subjects except for editing to lower the 
vocabulary level. 


You know, lots of times things kind of seem like 
people to us—kind of remind us of people some- 
how. Many things around the house, for instance, 
or almost anything you know real well. Some 
things seem more like men and some things seem 
more like women. I have a set of cards here which 
have designs or pictures on them. I’m going to 
show them to you one at a time—and you tell 
me which they remind you of more—men or 
women. Just use your imagination and tell me 
the first answer you think of. Don’t stop to think 
about it; just tell me the first thing you think of. 


Following the instructions, the experimenter pre- 
sented the 10 figures in an order randomized for 
each subject. The subjects were required to respond 
within approximately 2 seconds.


RESULTS 


FIG. 1. Percentage of responses consistent
with the Freudian hypothesis as a function of
maturation level. (Indicated by the grade scale
adjusted so as to equate maturation level for the two
sexes. The two points plotted at the extreme
represent the means for the college subjects.)

The data of the experiment are presented
graphically in Figure 1. The percentage of
responses plotted there is the percentage (out
of five male symbols or out of four female
symbols) which were consistent with the
Freudian hypothesis. The data are plotted
separately for male and female subjects and
for their response to the two types of symbols,
male and female. Data for the normal adults
of the prior study (Jones, 1956) are plotted
for comparison. Table 1 summarizes an analysis
of variance of trends (Grant, 1956) over
the four sexual maturation levels. The arc-sine
transformation for proportions was employed
(Walker & Lev, 1953). Differences
between maturation levels (linear component)
attained the .06 level of significance, and
differences between symbol types the .01 level.
Differences between subject sex groups failed
to reach significance, as was true of the
various interactions.
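
For readers unfamiliar with the arc-sine transformation for proportions, a small sketch is given here. It assumes the common form 2·arcsin(√p) in radians; the exact scaling tabled by Walker and Lev (degrees or radians, with or without the factor of 2) is not stated in the text, so the units of the mean squares in the tables need not match this sketch.

```python
# Sketch of the arc-sine (angular) transformation for a proportion.
# Assumption: the common form phi = 2 * asin(sqrt(p)), in radians.
import math


def arcsine_transform(p):
    """Map a proportion in [0, 1] onto an angle whose variance is roughly stable."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a proportion must lie between 0 and 1")
    return 2.0 * math.asin(math.sqrt(p))


# Proportions of "correct" responses out of five male symbols.
for hits in range(6):
    p = hits / 5
    print(f"p = {p:.1f} -> {arcsine_transform(p):.3f} rad")
```

The transformation stretches proportions near 0 and 1, where raw proportions have smaller sampling variance, which is the usual reason for applying it before an analysis of variance.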

An additional analysis of variance was performed
which compared the pooled child data
of the present study with the adult data of
the prior one. This analysis, summarized in
Table 2, was thus based upon two age classes
and, as before, two sex classes and two symbol
type classes. Scores again were the arc-sine
transformations of the proportions of “correct”
responses. Significant differences were
demonstrated between the two age groups and
between symbol types (p < .01 in each case).
The Age × Sex interaction approached
conventional significance levels (p < .08). All
other interactions were nonsignificant.


TABLE 1
ANALYSIS OF VARIANCE OF PROPORTIONS OF “CORRECT”
RESPONSES FOR EACH SYMBOL TYPE


Source                                    df      MS       F
Between maturation levels                  3     7,708    2.15
  linear                                   1    13,489    3.82*
  quadratic                                1
  cubic                                    1     7,657    2.17
Between sexes                              1     5,000
Sex × maturation level                     3
Between subjects within groups            72     3,082
Between symbol types                       1
Symbol type × maturation level             3       744
Symbol type × sex                          1        21
Symbol type × maturation level × sex       3
  linear                                   1     3,024    2.70
  quadratic                                1       288    2.94
  cubic                                    1
Pooled symbol type × subjects
  within groups

Note.—Data in the form of arc-sine transformation for
proportions.
* p ≤ .06.
TABLE 2


ANALYSIS OF VARIANCE COMPARING THE CHILD DATA
OF THE PRESENT STUDY WITH THE ADULT DATA OF
THE PREVIOUS STUDY


Source                                    df      MS       F

Between ages                               1    72,904   21.44***
Between sexes                              1       320
Age × sex                                  1    10,682    3.14**
Between subjects within groups            96     3,400
Between symbol types                       1     6,083    4.90***
Symbol type × groups                       3     3,271
Pooled symbol type × subjects
  within groups                           96     1,241


Note.—Data in the form of arc-sine transformation of
proportions of “correct” responses for each symbol type.
** p ≤ .08.
*** p ≤ .01.
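
As a reading aid, the F ratios in Table 2 can be checked against the mean squares. The sketch below assumes the usual arrangement for this kind of mixed design: between-subject effects are tested against the between-subjects-within-groups mean square (3,400) and symbol type effects against the pooled symbol type by subjects mean square (1,241). The variable names are illustrative.

```python
# Sketch: recompute F ratios from the mean squares legible in Table 2.
# Assumption: each effect is divided by the error term of its own stratum.

MS_BETWEEN_SUBJECTS_ERROR = 3400   # between subjects within groups, df = 96
MS_WITHIN_SUBJECTS_ERROR = 1241    # pooled symbol type x subjects, df = 96


def f_ratio(ms_effect, ms_error):
    """F is the effect mean square divided by the appropriate error mean square."""
    return ms_effect / ms_error


f_ages = f_ratio(72904, MS_BETWEEN_SUBJECTS_ERROR)
f_age_by_sex = f_ratio(10682, MS_BETWEEN_SUBJECTS_ERROR)
f_symbol_types = f_ratio(6083, MS_WITHIN_SUBJECTS_ERROR)
f_symbol_by_groups = f_ratio(3271, MS_WITHIN_SUBJECTS_ERROR)

print(f"Between ages:         F = {f_ages:.2f}")
print(f"Age x sex:            F = {f_age_by_sex:.2f}")
print(f"Between symbol types: F = {f_symbol_types:.2f}")
print(f"Symbol type x groups: F = {f_symbol_by_groups:.2f}")
```

The first three ratios agree with the tabled values, which suggests the mean squares were read correctly.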


DISCUSSION 


The finding of principal interest has to do
with the differences between maturation levels,
the linear component of which was significant 
at the .06 level. An appropriate procedure, 
when findings fall very slightly short of the 
accepted .05 level, would be to replicate the 
experiment within the same population so as 
to provide a more definite evaluation when it 
is reported. Such a replication would have 
been desirable here but was not possible for 
administrative and public relations reasons. 
Consequently, the finding will be discussed 
tentatively here as significant, with the obvi- 
ous comment that additional assessments of 
the relationship are needed. 

The overall downward trend in sexual 
symbolic response over the four maturation 
levels of the present study supports the hy- 
pothesis that increased sexual drive as puberty 
approaches is accompanied by a raising of 
generalization gradients and an attendant 
decrement in discrimination between symbols 
of “maleness” and “femaleness.” The four 
maturation levels extend, roughly, from 4 
years prior to average age of pubescence, to 
1 or 2 years beyond it. Sexual symbolic re- 
sponse during that period decreased from 
66% to 52% (approximately chance expect- 
ancy). This trend was a highly consistent 
one, occurring in the responses of both male 
and female subjects to both male and female
symbols. Although it is not possible to reject 
entirely the hypothesis that socially acquired 
learnings tend gradually to instill the sexual 
symbolic response, it appears that their effect, 
if any, is obscured in the years approaching 
puberty by the relationship between increased 
drive and lowered discrimination. (The curi- 
ous upward trend in the data of the third 
maturation level [approximately age 12 for 
girls, 14 for boys; see Figure 1] fails to reach 
conventional significance levels as tested by 
the cubic component of variance associated 
with maturation levels.) 

The finding of significant differences be- 
tween symbol types, with male symbols being 
responded to “correctly” more often than 
female symbols, confirms a trend in the data 
of the previous study (Jones, 1956). Whether 
or not this is a phenomenon of some generality 
remains unclear, since no attempt was made 
to equate the particular male and female 
symbols used for the degree of stimulus 
generalization which they represented. It is 
entirely conceivable that another set of al- 
leged sex symbols might lead to a contrary 
finding. What is suggested, possibly, is that 
it is easier to draw a male sex symbol than 
a female sex symbol. 

Comparison of the child data of the present 
study with the adult data of the previous one 
(Table 2) showed a significant increase in 
sexual symbolic response at young adult years 
over the highest frequency obtained in any 
of the prepubescent years. The frequency of 
response was 82% for the adult group (mostly 
in their early twenties) as compared with a 
high of 66% for children approximately 4 
years prepubescent. Although sex drive is 
generally believed to decrease slightly between 
adolescence and the early twenties, it does 
not seem plausible to regard the associated 
lowering of generalization gradients as the sole 
cause of the striking increase in symbolic 
response. A more plausible but clearly con- 
jectural explanation would be that following 
puberty individuals undergo a period of 
socially enforced discrimination training with 
respect to sexual cues and symbols, a degree 
of discrimination training not present in the 
prepubescent years. Thus, the overall developmental
history of the symbolic response, it is
suggested, involves two phases with, most 
probably, different determinants. For at least 
4 years prior to puberty, there is a continu- 
ous decrease in sexual symbolic response that 
is interpreted as a result of decreased dis- 
crimination attendant upon increased drive. 
By the early twenties, however, the frequency 
of sexual symbolic response has exceeded its 
prepubescent maximum, presumably as a 
function of the increased discrimination train- 
ing implicit in the highly focused social con- 
trol of adolescent heterosexual behavior. The 
verification of such an explanatory principle 
rests, of course, upon empirical research yet 
to be carried out. 


SUMMARY 


The purpose of the present study was to
assess the strength of the sexual symbolic
response in children of various ages through
pubescence. Sexual symbolic response was
defined as the designation of pointed, elongated
figures as male and of rounded, enclosing
figures as female. The figures employed were
relatively abstract geometric forms taken from
earlier studies of Jones and of Levy.

The mean frequency of sexual symbolic
response was 66% for children approximately
4 years prepubescent. By the age of puberty
or a year or 2 beyond, however, response had
dropped sharply to about chance expectancy.
The finding was interpreted as an effect of
decreased discrimination associated with
heightened sexual drive as puberty is
approached. The data were compared with the
performance of adults on the same task
reported in a prior study. By the age of about
22, the frequency of the sexual symbolic
response (82%) was shown to climb beyond
the prepubescent maximum, perhaps as a
function of the increased discrimination
training implicit in the highly focused social
control of adolescent heterosexual behavior.
RELATIONSHIPS BETWEEN 1960 STANFORD-BINET,
1937 STANFORD-BINET, WISC, RAVEN,
AND DRAW-A-MAN


BETSY WORTH ESTES, MARY ELLEN CURTIN, ROBERT A. DeBURGER,
AND CHARLOTTE DENNY 1


University of Kentucky 


The latest revision of the Stanford-Binet 
(S-B) test of intelligence was published in 
January 1960; therefore, it was thought ad- 
visable to compare the IQs of a group of white 
American children on the 1960 test with IQs 
made by the same group on the 1937 S-B and 
on the Wechsler Intelligence Scale for Chil- 
dren (WISC). In addition, Raven Progressive 
Matrices (1938 & 1947) and Goodenough 
Draw-A-Man (D-A-M) IQs for part of the 
group are compared with the two S-Bs and 
the WISC. 

The group consists of pupils attending the 
University of Kentucky School, grades one 
through eight. Parents’ socioeconomic status 
is above average; fathers’ occupations are 
managerial and professional. Tested intelli- 
gence of the pupils is likewise above average; 
mean IQ on the 1960 S-B equals 123; the 
range is 84 to 159. The selection criterion was 
the availability of scores for the 1937 S-B 
L and M forms and WISC full scale. All pu- 
pils meeting this criterion were included in 
this study, making a total of 82 for the major 
comparisons, 47 boys and 35 girls. 

The 1960 S-B was administered by the au- 
thors. All other tests were administered over 
a 4-year period by graduate students in psy- 
chology enrolled in an intelligence testing 
course. All of the tests were checked, rescored, 
and supervised by the senior author. 


RESULTS 


There were from one to four scores available
for each test; therefore, means were used,
when available, for greater reliability. The
1937 S-B score was a composite of the L and
M scores.


1 Denny is at the College of Nursing.

Comparisons are reported in the form of 
group means and product-moment correlation 
coefficients. 

The 1937 S-B scores were converted accord- 
ing to the equation provided in the 1960 S-B 
manual (Terman & Merrill, 1960, p. 339). 
This equation provides corrections for the 
mean and standard deviation which are not 
precisely equal for all chronological age groups 
for the 1937 S-B. The converted IQ scores
were generally lower than the unconverted
scores, mean deviations by intelligence level
varying from two to four points. Overall, however,
ever, differences for means, standard devia- 
tions, and correlations were small and gener- 
ally inconsistent and, therefore, comparisons 
involving the 1937 S-B are based on uncon- 
verted scores. 


WISC, 1937 S-B, 1960 S-B 


Age. Littell (1960), in a review of WISC
studies, concluded that children who are
younger and rank higher on the 1937 S-B
tend to score higher on the 1937 S-B than on
the WISC. Table 1 shows no differential
discrepancies in the present study due to
chronological age. The null effect of age has also
been found by Weider, Noller, and Schramm
(1951), Gehman and Matyas (1955), and
Schacter and Apgar (1958); hence, it appears
that the evidence for the age factor is
inconclusive. When the 1960 S-B is compared
with the 1937 S-B and with the WISC,
age is not found to be a factor.


TABLE 1
IQ SCORES

                          1960    1937 minus    1937    1937 minus            1960 minus
Group               N     S-B        1960       S-B        WISC       WISC       WISC
Average             12    100          2        102          1        101         -1
High average        19    115         -1        114          3        111          4
Superior            34    126          2        128      9a ± 1.24    119      7b ± 1.92
Very superior       17    138      8b ± 1.93    146     16a ± 1.89    130      8b ± 2.08
CA 6-10 to 10-0     33c   123          1        124          8        116          7
CA 10-0 to 14-1     34c   122          4        126          9        117          5
Entire group        82    123          2        125          7        118          5


Note.—Intelligence levels based on 1937 S-B scores.
a Discrepancy = 0, p < .00002.
b Discrepancy = 0, p < .002.
c The N for the CA groups is less than for the entire group because some
children changed from the lower to the higher CA classification during the
4-year testing period.


Intelligence. Littell’s conclusion regarding
the effect of IQ level on WISC-1937 S-B
test discrepancies is supported in the present
study. Two-tailed t tests were made when
the discrepancies were greater than the five
points usually allowed as test-retest error. For
the two superior groups, the superiority of
the 1937 S-B over the WISC is significant,
p < .00002. This finding applies likewise to
the 1960 S-B superiority over the WISC,
p < .002. For average groups, WISC scores are
comparable to both the 1937 and 1960 S-B
scores. Discrepancies between both S-B tests
and the WISC are greater for the superior
levels than for the average levels, p < .00002
for the 1937 S-B and the WISC and p = .02
for the 1960 S-B and the WISC. Error terms
for the 1937 S-B and the 1960 S-B comparisons
are 1.61 and 2.13, respectively. 1937 and
1960 S-B scores are comparable except at the
very superior level where the discrepancy is
highly significant, p < .002. Cronbach
(1960, p. 171) reports, “IQs on the two scales
are not strictly comparable.”

Entire group. The major correlations for the
entire group are presented in Table 2.
Intercorrelations between the 1960 S-B, the 1937
S-B, and the WISC do not differ significantly
from each other. Although the population is
a [illegible] one and the size of the sample is
relatively small, the agreement found between the
1937 S-B and the WISC compares favorably
with the correlations reported for white American
children by Littell (1960), i.e., in the
.80s. The correlation between the 1937 S-B
and the WISC tends to be higher than that
between the 1960 S-B and the WISC. This
might be expected for two reasons: (a) the
test interval is greater for the 1960 S-B and
WISC administrations and (b) the 1937 S-B
scores are based on two to four tests and,
hence, may be more reliable than the 1960
S-B scores which are based on one test. The
correlation between the 1960 and 1937 S-B
scores reaches a respectable validity figure.
The 1937 L and M correlation compares
favorably with that reported by Terman and
Merrill (1937, p. 47) for the 1937 S-B
standardization group.


Raven and Draw-A-Man 


As a trend, the Raven Progressive Matrices
is superior to the D-A-M in predicting the
1960 S-B, the 1937 S-B, and the WISC
(Table 3). Differences between the Raven and
D-A-M are not significant, however. While the
Raven and D-A-M scores account for a
significant amount of the variance of the three
major tests, the magnitude of these correlations
is relatively small, and, consequently,
individual predictability is low. The D-A-M
and 1937 S-B correlation compares favorably
with that found by McHugh (1945) for public
school kindergarten children but is much
lower than Goodenough’s (1926, p. 51)
D-A-M and 1916 S-B correlation of .74. The
Raven and 1937 S-B correlation is higher than
that found by Banks and Sinha (1951) for
London school children, whereas the Raven
and WISC correlation is much lower than the
corresponding correlations of .91 and .75
found by Martin and Wiechers (1954) and
Barratt (1956), respectively, for American
school children. It is difficult to evaluate
comparisons involving the Raven and D-A-M tests
since there are so few representative studies
in the literature.


TABLE 2
MAJOR CORRELATIONS

                                                1937 S-B,
Test             1960 S-B        WISC           Form L
1937 S-B           .82         .80—.80’s
WISC               .74
1937 S-B, M                                     .92—.91

Note.—N = 82; all > 0, p < .005; representative finding
on right.


TABLE 3
RAVEN AND DRAW-A-MAN CORRELATIONS

Test             1960 S-B      1937 S-B         WISC
D-A-M              .43         .46—.41          .43
Raven              .59         .67—.54          .55—.91
                                                    .75

Note.—N = 72; all > 0, p < .005; representative findings
on right.
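
The point about low individual predictability can be illustrated by converting the correlations with the 1960 S-B into proportions of variance accounted for and a standard error of estimate. This is a sketch only: the correlations (.43 for the D-A-M and .59 for the Raven) are those listed in Table 3, while the standard deviation of 16 IQ points is an assumption about the 1960 S-B deviation IQ scale and is not a figure computed from this sample.

```python
# Sketch: variance accounted for and standard error of estimate when
# predicting 1960 S-B IQ from the D-A-M or the Raven.
# Assumption: criterion standard deviation of 16 IQ points.
import math

ASSUMED_SD_IQ = 16.0


def variance_explained(r):
    """Proportion of criterion variance accounted for by the predictor."""
    return r * r


def standard_error_of_estimate(r, sd=ASSUMED_SD_IQ):
    """Typical size of the prediction error for an individual, in IQ points."""
    return sd * math.sqrt(1.0 - r * r)


for name, r in [("D-A-M", 0.43), ("Raven", 0.59)]:
    print(
        f"{name}: r = {r:.2f}, variance explained = {variance_explained(r):.0%}, "
        f"standard error of estimate = {standard_error_of_estimate(r):.1f} IQ points"
    )
```

Under these assumptions even the stronger predictor leaves most of the variance unexplained, which is the sense in which individual predictability is low.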


DISCUSSION


The primary purpose of this report is to 
compare the 1960 S-B with its predecessor, 
the 1937 S-B, and another major children’s 
intelligence test, the WISC. The 1960 S-B 
was found to correlate equally well with the 
1937 S-B and the WISC, these correlations 
being comparable to representative findings 
for similar relationships. While the 1960 S-B 
was found to predict the 1937 S-B quite well 
for a fairly heterogeneous group, prediction 
was not equal for subgroups classified accord- 
ing to 1937 S-B intelligence levels. Agreement 
was found for average and superior levels but 
not for the very superior level. This partial 
disagreement of the two scales supports Cron- 
bach’s (1960) doubt that they are strictly 
comparable but further investigation with re- 
gard to intelligence level is indicated due to 
the relatively small size of the samples
reported here.

Intelligence level was likewise found to be a 
significant factor preventing high agreement
between both S-B tests and the WISC, a fact
well supported by comparisons of the 1937 
S-B and the WISC. Implications from find- 
ings of the present study are that at average 
levels of intelligence WISC scores may be 
used interchangeably with scores from both 
S-B tests. This is not true for the superior 
levels where the obtained significant discrep- 
ancies of 7 to 16 points may easily place the 
scores in different intelligence classifications 
resulting in somewhat spurious or doubtful 
interpretations. However, it would be unwise 
to set definite limits for test comparability 
according to intelligence level on the basis 
of existing information which is not in com- 
plete agreement or strictly comparable. More 
information is needed drawn from larger and 
more representative samples. This informa- 
tion is no doubt available if existing test data 
were classified and analyzed. 


SUMMARY 


1. The comparability of IQs from five dif- 
ferent intelligence tests was investigated for 
an above average group of white American 
children. 

2. For the entire group (N = 82), scores 
for the 1960 S-B, the 1937 S-B, and the WISC 
were found to be comparable and to compare
favorably with representative similar findings.

3. The age factor, contrary to some previ- 
ous findings, was not found to account for test 
discrepancies among the two S-B and WISC 
instruments. 

4. Intelligence level, in agreement with 
previous findings, was a factor producing 
highly significant discrepancies at superior 
levels among the two S-B and WISC instru- 
ments. More investigation is needed regarding 
the effect of intelligence level on test com- 
parability. 

5. Correlations relating the Raven and
D-A-M to the two S-Bs and WISC were
significant but relatively small.
EGO IDENTITY, ROLE VARIABILITY, AND ADJUSTMENT 1


JACK BLOCK 


University of California, Berkeley 


The meaning of the concept of “ego identity”
is still evolving. In Erikson’s perceptive
essays on adolescent adjustment and behavior
(1950, 1956), the construct is discussed and
explored so richly that an easy and clearly
sufficient operational translation of the notion
cannot be formulated. A core meaning of
the concept, however, is perhaps to be found
in Erikson’s (1950) statement that “the
sense of ego identity is the (individual’s)
accrued confidence that (his) inner sameness
and continuity are matched by the sameness
and continuity of (his) meaning for others
. . .” (p. 228).

In this definition, three elements are pres- 
ent. First, an individual must perceive himself 
as having “inner sameness and continuity,” 
i.e., he must, over time, presume himself to be 
essentially the same person he has been. Sec- 
ond, the surrounding persons in one’s social 
milieu must perceive a “sameness and conti- 
nuity” in the individual also. And finally, the 
individual must have “accrued confidence” in 
a correspondence between the two lines of 
continuity, internal and external. His percep- 
tion of the person he sees himself as being 
must be validated by feedback from his inter- 
personal experiences. 

From this definition, one aspect of ego 
identity may be singled out for study, the 
dimension we have labeled role variability 
(RV). The meaning of role variability is per- 
haps most readily indicated by describing its 
extremes. At one end of this dimension, there 
is “role diffusion,” where an individual is an 
interpersonal chameleon, with no inner core 
of identity, fitfully reacting in all ways to all 
people. This kind of person is highly variable 
in his behaviors and is plagued by self-doubts
and despairs for he has no internal reference
which can affirm his continuity and self-in-
tegrity. At the other extreme, there is what
might be called “role rigidity,” where an in-
dividual behaves uniformly in all situations,
disregarding the different responsibilities dif-
ferent circumstances may impose. Here the
core of identity is hollow, based not on a
genuine and unquestioned sense of personal
integrity but rather upon deep seated fear of
any amount of self-abandon. Somewhere in
between, presumably, a proper balance can be
struck in the struggle both for identity and
the capacity for intimacy.


1 This investigation was supported by Research
Grant M-1078 from the National Institute of Mental
Health of the United States Public Health Service.

In terms of the preceding definition of ego 
identity, role variability may exist at the level 
of self-evaluation and separately, at the level 
of observations by others. It is the relation 
between these two levels of role variability 
that specifies ego identity or a problem of ego 
identity. In the present study, the focus has 
been limited to “sameness and continuity” as 
perceived and evaluated by the participant, 
and its significance for adjustment. 

Some years ago, the writer reported a study 
of the way in which an individual’s role be- 
haviors changed as a function of various in- 
teractional contexts (Block, 1952). A single 
subject was studied by having her system-
atically describe her interactions with a set of
“relevant others.” These descriptions were
then factor analyzed, an instance of O-tech-
nique (Cattell, 1946), and it was observed
that the factor dimensions appeared to order
and to summarize, in a cogent way, the sev-
eral kinds of roles this subject manifested.

In this earlier study, the focal subject
viewed herself quite differently, depending
upon the person with whom she was con-
fronted, and yet she still presented a sub-
stantial core of interpersonal consistency. Al-
though seeing herself as changing from rela-
tionship to relationship, a general factor of
some consequence proved to underlie all her
interactions.

During the course of evaluation of this and 
other findings, the general question was raised, 
“How much . . . interpersonal consistency,
i.e., interpersonal sameness, is socially appro-
priate and, in terms of the individual’s in-
ternal psychic economy, consonant with his
need systems?” (Block, 1952, p. 285). In
Erikson’s terms, role diffusion is a person-
ally untenable situation for the individual. He
is beset by too many despairs and self-contra-
dictions as a consequence of his extreme be-
havioral fluctuations. Role rigidity, where an
individual is not affected in his behaviors as
a function of the persons with whom he is in-
teracting, may be an indefinitely prolonged
adaptation of sorts, but certainly is not an
optimal interpersonal solution. It prohibits
further growth and development of the indi-
vidual; it is effective as a security mechanism
only so long as the interpersonal surround is
a tolerant one. From this frame of reference
follows the simple hypothesis the present
study endeavored to test: “the amount of
interpersonal consistency is curvilinearly re-
lated to the degree of maladjustment, as de-
fined independently” (Block, 1952, p. 285).


METHOD 


Index of Role Variability 


A set of 20 adjectives selected a priori as reflecting
various fundamental facets of interpersonal behavior
served as the basic descriptive device. Each subject
ranked this set of 20 adjectives eight times, so as to
characterize his own behavior while with each of
eight specified “relevant others.” The 20 adjectives
employed were listed in the following order: relaxed,
formal, indifferent, warm, independent, witty,
[illegible], assertive, indecisive, distractible,
[illegible], masculine (or feminine if rater is
female), wise, unselfish, trusting, [illegible],
responsive, and protective. Subjects were asked to
describe via this ranking of adjectives technique
their behavior and relationships with the following
eight individuals: someone in whom you are sexually
interested, an acquaintance you don’t care much
about, your employer or someone with equivalent
status, a child, a parent (or parent figure) of the
same sex, a parent (or parent figure) of the other
sex, a close friend of the same sex, and an
acquaintance whom you would like to know better.
The eight specified other persons were selected as
sampling widely, if not exhaustively, the interper-
sonal possibilities of the subjects employed. The test
materials were arranged in booklet form, the de-
scriptions being recorded on separate pages.

For each subject, the eight adjective rankings were 
intercorrelated by the Spearman rank-difference cor-
relation method and the resulting 8 × 8 correlation
matrix factor analyzed by the Thurstone centroid 
method. For each matrix, the percentage of the total 
communal variance explained by the first unrotated 
factor was calculated. The first unrotated factor of 
a matrix reflects the degree of congruence among the 
set of variables being studied. For the present data, 
the first unrotated factor indicates the extent of in- 
terpersonal consistency a subject views himself as 
manifesting over the set of relevant others specified. 
By dividing the mean first factor loading (squared) 
by the average communality of the matrix, an index 
of role variability is derived which is comparable 
from matrix to matrix, and hence from individual to 
individual. When high, this score suggests that the
individual involved views himself as essentially the
same person in his several interactions (he is inter-
personally consistent); when low, the subject has de-
scribed himself as rather different from situation to
situation (he is interpersonally changeable).² This in-
terpersonal consistency score has a potential range of
0 to 100.
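
As a rough illustration of the index just described, the following Python sketch intercorrelates a subject's eight rankings by the rank-difference formula and takes the share of communal variance carried by the first centroid factor. It is a minimal sketch under stated assumptions, not a reproduction of the original computation: the function names are hypothetical, the rankings are assumed to contain no ties, and the communalities are approximated by the largest off-diagonal correlation in each column instead of being obtained from a complete factoring of the matrix.

```python
import numpy as np


def spearman_matrix(rankings):
    """Rank-difference correlations among the rows of `rankings`.

    `rankings` is a (contexts x adjectives) array; each row holds the
    ranks 1..n with no ties (an assumption of this sketch).
    """
    r = np.asarray(rankings, dtype=float)
    n = r.shape[1]
    # Sum of squared rank differences for every pair of rows.
    d2 = ((r[:, None, :] - r[None, :, :]) ** 2).sum(axis=2)
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


def consistency_index(rankings):
    """Percent of (estimated) communal variance on the first centroid factor."""
    R = spearman_matrix(rankings)
    off = R - np.diag(np.diag(R))      # correlations with the diagonal zeroed
    h2 = np.abs(off).max(axis=0)       # assumed communality estimates
    reduced = off + np.diag(h2)        # reduced correlation matrix
    total = reduced.sum()
    if total <= 0:
        # A full centroid analysis would reflect variables here; not handled.
        raise ValueError("first centroid factor is undefined for this matrix")
    loadings = reduced.sum(axis=0) / np.sqrt(total)   # first centroid loadings
    return 100.0 * (loadings ** 2).sum() / h2.sum()


if __name__ == "__main__":
    # A subject who ranks the 20 adjectives identically in all eight contexts.
    same = np.tile(np.arange(1, 21), (8, 1))
    print(consistency_index(same))     # 100.0
```

A subject whose eight rankings are identical reaches the ceiling of 100; rankings that differ from one relevant other to the next pull the score down.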


Measurement of Maladjustment 


Each subject also responded to the 480-item Cali- 
fornia Psychological Inventory (CPI) (Gough, 1957). 
The CPI, by virtue of the conceptual scheme which 
has guided it from its inception, employs a pool of 
items which are understandable by and inoffensive 
to nonpsychiatric populations and it was for this 
reason this inventory was used, even though the es- 
tablished scales of the CPI provide no direct meas- 
ure of maladjustment. Although a knowledgeable in- 
terpreter can readily discern maladjustment from the 
nature of a subject’s CPI profile, for our present 
purposes a more direct CPI measure of maladjust- 
ment was desired. 

As part of the continuing program of scale de- 
velopment at the Institute of Personality Assessment 
and Research, it proved possible several years ago to 
establish, refine, and validate a personality scale to 
index an individual’s “susceptibility to anxiety.” This 
scale, labeled Psychoneuroticism (Pn), was devel-
oped by the sequential application of cluster analysis,
item analysis against cluster score criteria, a further
method of dimensional purification, and finally, vali-
dation on a number of different subject samples. An-
other paper will describe in some detail the develop-
ment of the Pn scale and several other new scales
with unusual properties but for the present report a
brief description of the Pn scale is in order.


2 Often, extracting the first factor from each sub-
ject’s interperson matrix will prove to be too endless
a job of calculation for the individual researcher. In
such instances, Kendall's coefficient of concordance,
computed for each subject, may be substituted as a
quick and largely adequate index of interpersonal
consistency.
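
Kendall's coefficient of concordance, which Footnote 2 offers as a quick substitute for the factor-based index, can be sketched in a few lines of Python. The sketch assumes tie-free rankings and uses hypothetical function names; the second function applies the standard identity linking W to the average rank-difference correlation among the rankings.

```python
def kendalls_w(rankings):
    """Kendall's W for m rankings of the same n items (no ties assumed)."""
    m = len(rankings)
    n = len(rankings[0])
    # Total rank received by each item across the m rankings.
    rank_sums = [sum(row[j] for row in rankings) for j in range(n)]
    mean_sum = m * (n + 1) / 2.0
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12.0 * s / (m ** 2 * (n ** 3 - n))


def mean_spearman_from_w(w, m):
    """Average pairwise rank correlation implied by W for m rankings."""
    return (m * w - 1.0) / (m - 1.0)
```

For the present design, m would be 8 (the relevant others) and n would be 20 (the adjectives). W runs from 0 to 1, so multiplying by 100 would put it on the same nominal scale as the consistency score, although the two indices need not agree numerically.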

The Pn scale contains a total of 45 items³ of
which 33 are contained in the MMPI item pool and 
19 in the CPI collection of items (the apparent in- 
consistency in arithmetic here is due to the inclusion 
in the CPI of a number of MMPI items). The 19 
Pn items scorable from a CPI protocol are sufficient 
in number to provide a reliable and dimensionally 
valid score. 
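
The arithmetic can be made explicit. Assuming that every one of the 45 Pn items appears in at least one of the two item pools (an assumption; the text does not say so outright), the number of items common to both pools follows by inclusion-exclusion:

```latex
\[
\lvert \text{MMPI} \cap \text{CPI} \rvert = 33 + 19 - 45 = 7 .
\]
```

On this reading, 7 of the 19 Pn items scorable from a CPI protocol are also MMPI items.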

The psychological meaning of the Pn scale is best
conveyed for the purposes of this paper by a listing
of its relationships to other established scales. It cor-
relates in the .70s and sometimes .80s with the MMPI
Psychasthenia scale, the MMPI Manifest Anxiety
scale compiled by Taylor (1953), and the Anxiety
scale developed from a factor analysis by Welsh
(1956). All of these scales are good representatives
of the first underlying dimension repeatedly found in
factor and cluster analyses of personality inventories
(cf., e.g., Block & Bailey, 1955b; Cook & Wherry,
1950; Cottle, 1950; Kassebaum, Couch, & Slater,
1959; Tyler, 1951; Wheeler, Little, & Lehner, 1951).
This dimension, variously measured, has proved of
broad significance in both correlational and experi-
mental studies (cf., e.g., Block & Bailey, 1955a;
Eriksen, 1954; Farber & Spence, 1953; Taylor, 1956)
and it is clear now that individuals at the unfavor-
able end of the continuum tend to be troubled, self-
preoccupied, and vulnerable to happenings that for
most would go unnoticed. The choice, in particular,
of the Pn scale to index this dimension of suscepti-
bility to anxiety was in large part dictated by the
option it provided of scoring individuals from CPI
protocols. This choice was buttressed in addition by
the scale’s factorial origins and its development on
nonpathological samples.


Subjects 


Forty-one college students in a class on factor 
analysis collected the interpersonal and CPI data 
(and factored their respective small matrices). Ano- 
nymity was preserved for the participants. Many of 
the students collecting the data later admitted having 
used themselves as subjects in order to test the psy-
chological insightfulness of the later factor analytic
results. For this reason, it seems likely that the re-
sponses of the subjects to both the interpersonal
consistency and CPI procedures were offered with 
appropriate motivation. 


3 A complete list of the CPI and MMPI items
defining the Pn scale has been deposited with the
American Documentation Institute. Order Document
No. 6828 from ADI Auxiliary Publications Project,
Photoduplication Service, Library of Congress, Wash-
ington 25, D. C., remitting in advance $1.25 for
microfilm or $1.25 for photocopies. Make checks 
payable to Chief, Photoduplication Service, Library 
of Congress. Copies of the scale are also available 
from the writer upon request. 
RESULTS


For the measure of interpersonal consist- 
ency, the mean score was 71.42, the standard 
deviation being 13.73. The distribution of 
scores was somewhat negatively skewed. The 
mean score on the Pn scale was 7.35, with a 
standard deviation of 3.02. Here the score 
distribution was moderately skewed positively. 

The hypothesis originally advanced asserts 
that a significant curvilinear relationship, as 
measured by eta, should exist between inter- 
personal consistency and maladjustment. A 
scatter plot, however, shows quite clearly that 
a linearity assumption fits the data well. The 
product-moment correlation between the index 
of interpersonal consistency and scores on the 
Pn scale is -.52, significant beyond the .001
level. Individuals who tend to see themselves
as varying from interaction to interaction are
also more maladjusted, as measured by the Pn
scale. The expectation that individuals with
too little role variability would also prove to
have weaknesses in their personality makeup
was not confirmed.
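
The significance of a product-moment correlation such as the one just reported can be checked by converting r to a t statistic with N - 2 degrees of freedom. The Python sketch below is an illustration only: it assumes that all 41 subjects entered the correlation and that the test is two-tailed, neither of which the text states, and the function names are hypothetical. The tail area is obtained by numerically integrating the Student t density so that nothing beyond the standard library is needed.

```python
import math


def t_from_r(r, n):
    """t statistic for a product-moment correlation r based on n pairs."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def two_tailed_p(t, df, steps=20000):
    """Two-tailed p for Student's t, by Simpson's rule on the density.

    `steps` must be even; the density is integrated from 0 to |t|.
    """
    c = math.gamma((df + 1) / 2.0) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2.0)
    )

    def density(x):
        return c * (1.0 + x * x / df) ** (-(df + 1) / 2.0)

    upper = abs(t)
    h = upper / steps
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    area = total * h / 3.0            # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * area)


if __name__ == "__main__":
    t = t_from_r(-0.52, 41)
    print(round(t, 2), two_tailed_p(t, 41 - 2))
```

Under these assumptions, r = -.52 with 41 subjects corresponds to a t of roughly -3.80 on 39 degrees of freedom, which falls beyond the .001 level, in agreement with the text.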

In order to understand more closely the sig- 
nificance of the interpersonal consistency in- 
dex, an analysis was undertaken of the 480 
items in the CPI. The 20 individuals with the 
highest interpersonal consistency score were 
constituted as one group and the 20 indi- 
viduals with the lowest interpersonal consist- 
ency scores were formed into a second group. 
For each of the CPI items, the relative fre-
quencies of response of the two groups were
evaluated by means of Fisher’s exact test
for 2 × 2 contingency tables. The number of
items discriminating beyond the .05 level of
significance was in excess of the number to be
expected on the basis of chance by a factor of
three, clear evidence for their nonchance na-
ture (Block, 1960). The content of these dis-
criminating items can thus serve to enrich our
understanding of the psychological meaning
of the index of interpersonal consistency. To
bring some order into the set of distinguish-
ing items, they are presented as grouped into
four tentative and somewhat overlapping cate-
gories. Except where noted, for all the items
listed, the individuals who are changeable in
their interpersonal role tend to say Yes
more often. Significance level and frequencies
of the Yes response in the interpersonally
changeable and interpersonally consistent
groups are indicated in parentheses.
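
To make the item analysis concrete, the following Python sketch computes Fisher's exact probability for a single 2 × 2 table of Yes and No responses in the two groups of 20. It is a minimal sketch with a hypothetical function name, and it assumes a two-sided test (summing every table no more probable than the one observed); the text does not say whether one-sided or two-sided probabilities were used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p for the 2 x 2 table [[a, b], [c, d]].

    Rows are the two groups; columns are Yes and No responses.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x Yes responses in the first group.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum all tables that are no more probable than the observed one.
    return sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )


if __name__ == "__main__":
    # 9 of 20 changeable subjects answer Yes against 0 of 20 consistent ones.
    print(fisher_exact_two_sided(9, 11, 0, 20))
    # 8 of 20 against 1 of 20.
    print(fisher_exact_two_sided(8, 12, 1, 19))
```

With groups of 20 each, a split of 9 Yes responses against 0 falls beyond the .01 level, and a split of 8 against 1 falls between the .05 and .01 levels.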


Items expressing social inarticulateness and concern 


83. I usually feel nervous and ill at ease at a for- 
mal dance or party (.01, 9-1). 

134. It makes me uncomfortable to put on a stunt
at a party even when others are doing the same sort
of thing (.01, 9-0).

173. My way of doing things is apt to be misun-
derstood by others (.05, 8-1).

198. Before I do something I try to consider how
my friends will react to it (.05, 14-6).

200. (Affirmed more by interpersonally consistent
individuals) In a group of people I would not be
embarrassed to be called upon to start a discussion
or give an opinion about something I know well
(.05, 11-18).

260. (Affirmed more by interpersonally consistent
individuals) I always try to do at least a little better
than what is expected of me (.05, 9-16).

285. I refuse to play some games because I am
not good at them (.01, 12-3).

373. My table manners are not quite as good at
home as when I am out in company (.01, 20-13).


Items expressing personal tension and neurotic char- 
acter 


358. I dream frequently about things that are best
kept to myself (.05, 5-0).

406. I have one or more bad habits which are so
strong that it is no use fighting against them (.05,
[illegible]).

425. I have often felt guilty because I have pre-
tended to feel more sorry about something than I
really was (.05, 7-1).

453. I work under a great deal of tension (.05,
[illegible]).


Items expressing cynicism based upon disappointment


[illegible]. Most people would tell a lie if they could
gain by it (.01, 15-5).

[illegible]. Most people inwardly dislike putting them-
selves out to help other people (.05, 8-1).

[illegible]. People pretend to care more about one an-
other than they really do (.05, 12-4).

[illegible]. Some people exaggerate their troubles in
order to get sympathy (.05, 19-13).

[illegible]. I must admit that people sometimes disap-
point me (.05, 20-14).


Items expressing familial tension


[illegible]. My parents have often disapproved of my
friends (.01, 11-0).

[illegible]. At times, I have been very anxious to get
away from my family (.05, 19-12).

[illegible]. My parents were always very strict and stern
with me (.01, 8-0).


Items not admitting categorization


[illegible]. Education is more important than most peo-
ple think (.05, 19-13).
Summarizing the sense of these discriminat-
ing statements, it seems clear that the indi-
viduals seeing themselves as highly variable
in their interactions experience strain and dis-
may in their social endeavors, view the world
as essentially unfriendly, and are personally 
troubled. 


DISCUSSION 


The results obtained support but one slope 
of the hypothesized inverted U relationship. 
Although extreme role variability does appear 
to be related to an independent index of per- 
sonality maladjustment, extreme role stability, 
at least as represented in this study, is not 
also indicative of neurotic qualities. This par- 
tial support of the hypothesized relationship, 
moreover, is not a one time finding for an- 
other, albeit less adequate, test of the same 
hypothesis derived equivalent results. In a 
study of 50 Vassar alumnae some 20 years 
after graduation, where interpersonal incon- 
sistency was defined in almost exactly the 
way already indicated, role stability corre- 
lates again showed a consistent picture of 
(relative) psychological health. For example, 
women who are stable in their interpersonal 
role were described in ratings formulated com- 
pletely independently of scores on interper- 
sonal consistency as relatively “indulgent and 
forgiving, protective of those close to her, 
sympathetic, efficient, adequate in her sexual 
role, turned to for advice and reassurance, 
facially and gesturally expressive, and con- 
siderate.” Women who are variable in the in- 
terpersonal behavior were described as “irri- 
table and overreactive, talkative, ostentatious, 
and sarcastic.” Interpersonal consistency cor- 
related .29 with a consensus rating of degree 
of adjustment, a relationship significant at 
the .05 level. A finding by Meltzer (1957) 
to the effect that a large self—ideal-self dis- 
crepancy—a reasonable measure of self-recog- 
nized maladjustment—is significantly related 
to extreme role variability, defined by means 
equivalent to those used here, may also be 
interpreted as congruent with the present find- 
ings. It seems fair to conclude, then, that a 
positive relation exists between role variabil-
ity as indexed here and the kind of malad-
justment where the individual explicitly and
consciously acknowledges his personal vulner-
ability. There is as yet no empirical sugges-
tion that extreme interpersonal consistency is
also related to maladjustment.

When a hypothesis achieves partial but not 
complete confirmation, there is the possibility 
that either the hypothesis is wrong as stated 
or that it was not tested properly. Before 
abandoning a hypothesis held likely on other 
grounds, it is required that the measures em- 
ployed be evaluated for their adequacy and 
the sample in which the relationship was 
sought be evaluated with reference to its ap- 
propriateness for the hypothesis being tested. 
Let us consider these in turn. 

The use of a personality scale such as Pn 
to index the dimension variously called “sus- 
ceptibility to anxiety,” “neuroticism,” “ego 
weakness” and so on is, as referenced earlier, 
both well-established and well-supported. In 
a large number of studies, the correlates of 
scores reflecting this dimension have testified
to its importance and meaning. We may pre- 
sume, therefore, that the maladjustment in- 
dex employed in the present study is both 
representative and, on the whole, effective. 

The merits and properties of the interper- 
sonal consistency measure of course cannot be 
fully evaluated presently. In support of its 
construct validity, however, a number of argu- 
ments can be adduced. The operations appear 
to parallel the conceptual steps involved in 
deriving the notion of role variability. The 
task presented to subjects is, on the face of 
it, not especially threatening, hence removing 
much of the conscious motivation for offering 
“safe” and uninformative responses. The way 
in which the subject’s data are then processed 
so as to provide a score is sufficiently compli-
cated and removed as to prevent a subject
from readily controlling, when he ranks his
adjectives, the score he later obtains. Finally,
the theoretically appropriate if as yet insuffi-
cient correlations of the measure with inde-
pendent variables in several studies suggest
its proximity, at least, to the underlying con-
cept of role variability.


4 It is important to note that the Pn scale is, in
this and a number of other studies, orthogonal to the
Ego Control (EC) Scale, a scale developed to meas-
ure tendency to constrict or to express impulse. The
EC scale correlates an insignificant .15 with role con-
sistency, reflecting a slight tendency for over control
to go along with reduced role variation. Although
certain scales reasonably equivalent to Pn emphasize
expressive or overt reactions to anxiety and de-em-
phasize suppressive and indirect reactions, the inter-
relations of Pn, EC, and the role consistency measure
in the present study suggest that the failure of Pn to
be related to role rigidity cannot be ascribed to a de-
ficient representation of covert maladjustment.

Perhaps the primary reason why the origi- 
nally formed hypothesis failed of complete 
confirmation in this and the other studies 
cited is that the samples involved, in all three 
cases, may have been too small and too homo-
geneous to contain enough individuals who, in
an absolute sense, were “role rigid.” This is a
post hoc explanation, of course, but it may
be that individuals who go on to college, in
the natural course of their selective evolution, 
necessarily develop some amount of flexibility 
in their role behaviors. College students can- 
not be insensitive and unresourceful before 
the various role demands made upon them 
and therefore, in working with college sam- 
ples, very many of the individuals who are 
rigidly the same in their interpersonal en- 
deavors may already have been screened out. 

As another alternative to explore, it may 
be that extreme interpersonal consistency is 
an aspect of personality at a somewhat later 
age, when the personally desperate individual 
has found a self-definition that is acceptable 
to him. This last possibility is infirmed some-
what by the inability to identify such indi-
viduals in the older Vassar sample. What is
required is another study, this time of a much 
larger and preferably less homogeneous sam- 
ple so that individuals who are extremely 
stable in their role behaviors may fairly be 
presumed to have been included. Although 
there has been confirmation and cross-valida-
tion of the hypothesis that extreme role vari-
ability will be associated with personality mal-
adjustment, it is premature, we would suggest,
to abandon the additional hypothesis that ex-
treme interpersonal consistency is also asso-
ciated with personality maladjustment. A fur-
ther empirical effort is needed to discover
whether there is a far side to the mountain.
In the meanwhile, to the extent that role
variability as measured here relates to role
diffusion as conceived by Erikson, the pres-
ent study offers support for the implications
he has drawn between ego identity and be-
havior.




SUMMARY 


From Erikson’s concept of ego identity,
the dimension of role variability was ab-
stracted. The hypothesis was advanced that
excessive role variability (“diffusion”) and in-
sufficient role variability (“rigidity”), be-
cause they both reflect problems in ego iden-
tity, would both be associated with maladjust-
ment. A measure of the extent to which an
individual perceives himself as varying in a
variety of interpersonal situations was devel-
oped. Role variability, so measured, proved
to relate significantly to maladjustment, as
measured by a CPI scale to measure “sus-
ceptibility to anxiety.” Role rigidity did not
relate to maladjustment. Supplementary find-
ings were introduced and some possible rea-
sons for the only partial confirmation of the
curvilinear hypothesis were offered.
SOMATIC EXPERIENCE IN THE ANXIETY STATE:


SOME SEX AND PERSONALITY CORRELATES OF
“AUTONOMIC FEEDBACK”¹


SHELDON J. KORCHIN² AND HELEN A. HEATH
Michael Reese Hospital, Chicago, Illinois 


One facet of the anxiety state is the ex- 
perienced alteration of somatic functioning. 
Recent work by Mandler and his associates 
(Mandler & Kremen, 1958; Mandler, Mand- 
ler, & Uviller, 1958) has again called attention 
to this aspect of anxiety, termed by them “au- 
tonomic feedback.” An Autonomic Perception 
Questionnaire (APQ) was developed for the 
self-description of somatic symptoms charac- 
teristic of subjects’ anxiety experience. They 
have investigated whether and how the num- 
ber and/or intensity of such symptoms is re- 
lated to other measures of anxiety and to au- 
tonomic reactivity under stress as measured 
directly. In their first study, which compared 
selected high and low APQ scorers, high sub- 
jects more commonly reported somatic sensa- 
tions during intellectually stressful tasks, and 
generally showed greater autonomic reactivity 
in terms of polygraphic measurements (Mand- 
ler et al., 1958). While generally replicating 
these findings, a second study of unselected 
subjects found less distinct relationships be- 
tween APQ scores and either somatic report 
or autonomic reactivity under the same stress 
(Mandler & Kremen, 1958). Correlations be- 
tween the APQ and both the Taylor Manifest 
Anxiety scale and a newly developed Body 
Perception Scale were positive in both studies,
though considerably lower in the second. The
present study is designed to replicate and ex-
tend these findings, and generally to explore
further the personality correlates of autonomic
feedback.³


[...] Grant M-1442 of the National Institute of Mental
Health. The authors are grateful to John Dubocaq,
Dean of Students at George Williams College, for
[...] and facilities and in engaging the cooperation of his
students and colleagues. Our thanks, too, to Barbara
White for help in test administration and scoring.

2 Now at the National Institute of Mental Health,
Bethesda, Maryland.


METHOD 


The Nowlis Adjective Check List (ACL), Barron
Ego Strength scale (Es scale), Taylor Manifest Anx-
iety scale (MA scale), and the Autonomic Perception
Questionnaire (APQ) were administered, in that or-
der, to better than half of the entire student body of
George Williams College in Chicago. Each of the four
college classes was tested in a separate session. The
study, it was explained, was intended to explore some
relations among test measures of emotional states.
Students were assured that individual scores would
be confidential, and that only group results might be
discussed with their college authorities. The present
report is based on the scores of 176 subjects (139
men and 37 women) for whom complete protocols
were available. Subjects ranged in age from 16 to 40
years.


3 For consistency with the earlier work, Mandler’s
terms will be used in this paper. However, the more
inclusive and neutral term “somatic experience”
seems to describe better the range of phenomena in-
cluded in Mandler’s “autonomic perception” or “au-
tonomic feedback.” Apparently, Mandler intended
these terms to describe sensitivity to the [illegible] in
bodily sensations which subjects might describe in
emotional states, regardless of whether they [illegible]
autonomic nervous system activity as such. Certainly,
different orders of neurological and physiological con-
trol are suggested by such diverse APQ items as:

7. When you feel anxious, are you aware of in-
creased muscle tension?

8. . . . do you get a headache?

16. . . . do you feel as if blood rushes to your
head?

19. . . . do you get a sinking or heavy feeling in
your stomach?

Just as “autonomic” suggests a too [illegible] physio-
logical mechanism, the words “perception” and “feed-
back” connote too immediate and definite a relation-
ship between the physiological event and the experi-
enced symptom. Precisely how experienced symptoms
are related to measurable physiological activity is the
research question at the core of Mandler’s [illegible],
though not an issue in this report. But, for clarity,
it should be remembered that APQ assesses the va-
riety and intensity of self-reported bodily symptoms
and that it defines a variable akin to what in clinical
contexts is described as somatization or even hypo-
chondriasis. Indeed, it would be appropriate and per-
haps clearer to distinguish high and low APQ scorers
as “somatizers” and “nonsomatizers.”

Nowlis Adjective Check List. The list consists of
[illegible] adjectives (including 10 duplicates) which the
subject rates on a four-point scale as more or less
characteristic of his mood state (Nowlis, 1953). Fac-
tor analytic studies reported by Nowlis and Green
(Green & Nowlis, 1957) reveal eight factors:
A, Concentration; B, Aggression; C, Pleasantness;
D, Activation-Deactivation, a bipolar factor; E, Ego-
tism; F, Social Affection; G, Depression; and H,
Anxiety. The original instructions request that the
subject reply in terms of his present mood. In this
study, he was asked instead to judge the items in
terms of what is generally characteristic of him, since
our concern is less with the subject’s momentary mood
than with his perception of more enduring modes of
emotional behavior. As an additional rough estimate
of emotional lability, which will not be treated in
this report, the subject was also asked to circle all
adjectives which described feelings he had during the
preceding 24 hours.
Barron Ego Strength scale. This scale was derived
empirically from the MMPI in terms of items which
distinguished patients who benefited from psycho-
therapy from those who did not (Barron, 1953).
From correlations with other assessment meas-
ures in additional patient and normal samples, Barron
interprets the test as a measure of ego strength,
including such characteristics as health and physio-
logical stability, strong reality sense, feelings of ade-
quacy and vitality, lack of prejudice, emotional spon-
taneity and outgoingness, and intelligence.
Taylor Manifest Anxiety scale. This procedure is
also a derivative of the MMPI, consisting of items
originally selected by clinical psychologists as ex-
emplifying the manifest symptoms of anxiety (Tay-
lor, 1953). The Taylor and Barron scales were ad-
ministered in a combined form.
Autonomic Perception Questionnaire. The questionnaire
described by Mandler, Mandler, and Uviller (1958)
was administered and scored according to their
procedure. The most relevant portion, for the
present study, consisted of 21 graphic scale items,
which comprise their “Anxiety APQ” score.
Each of these items starts with the dependent clause
“When you feel anxious . . .” and then inquires into
the frequency and/or intensity of symptoms in
seven areas of bodily function. For example, the
item “When you feel anxious, do your hands become
cold?” is rated on a scale whose low end is labeled
“No change.” An additional nine items describe symptoms
TABLE 1

SEX DIFFERENCE IN MEAN AUTONOMIC PERCEPTION
QUESTIONNAIRE, TAYLOR MANIFEST ANXIETY SCALE,
AND BARRON EGO STRENGTH SCALE

Test            Men      Women    t
Anxiety APQ     66.30    81.00    2.726*
Pleasure APQ    16.96    19.46    1.211
MA scale        12.66    17.65    3.698**
Es scale        48.95    46.43    2.620*

*p < .01.
**p < .001.


related to pleasure. These are of the same form, 
though starting with the clause “When you feel 
happy . . .” The sum of these nine items provides a 
“Pleasure APQ,” which will be of lesser concern in 
this report. 

An additional group of nine anxiety items was de- 
veloped to extend coverage in the areas sampled, 
and to tap further somatic areas (e.g., faintness, 
nausea, polyuria, diarrhea). Scores based on these 
new items correlated highly with the original, 21- 
item, Anxiety APQ—r=.736 (male subjects) and 
r=.739 (female subjects), both significant at <.001 
level. Moreover, the new and original APQ scores 
vary in precisely the same way with each of the 
other personality measures. However, for greater 
comparability with earlier work, the Anxiety APQ 
analyses reported in this paper are based only on the 
original 21 items. 


RESULTS AND DISCUSSION 


Sex Differences in Mean Test Scores 


There are distinct and significant differences 
in mean APQ, MA scale, and Es scale scores 
between men and women (Table 1). Women, 
in this population, are higher in manifest 
anxiety, lower in ego strength, and report 
more somatic experience. The female mean 
Pleasure APQ is also higher, though insignifi- 
cantly so. 

The magnitude of these differences is puz- 
zling, particularly since none of the original 
reports of these procedures describes similar 
sex differences in seemingly comparable popu- 
lations. Although women had slightly higher 
MA scale values in the original Iowa sample 
(Taylor, 1953), the difference was insignifi-
cant. Mandler and Kremen (1958) found no
difference in mean APQ between Harvard 
summer school men and women, though they 
did discover some differences in autonomic 


response measures. Similarly, Barron (1953) 
TABLE 2

INTERCORRELATIONS AMONG AUTONOMIC PERCEPTION QUESTIONNAIRE, TAYLOR MANIFEST ANXIETY
SCALE, AND BARRON EGO STRENGTH SCALE FOR MALE AND FEMALE SUBJECTS

                Pleasure APQ       MA scale           Es scale
Test            M       F          M       F          M        F
Anxiety APQ     .529*   .509*      .410*   .634*      -.258*   [illegible]
Pleasure APQ                       .167    .288       -.300*   [illegible]
MA scale                                              [illegible]

Note.—Male N = 139; female N = 37.
*p < .01.


mentions no sex difference in ego strength
scores.

In view of the decided sex differences in 
these measures, it is interesting that sexes do 
not differ at all in the ACL factor scores 
for anxiety and depression. Indeed, only one 
of the nine ACL scores (Social Affection, p 
< .05) is significantly higher for women than 
for men. There is some tendency for women 
to be higher in Activation, but only at the
p < .10 level.

There is no ready explanation for the siz-
able sex differences. However, in view of them


Relationship of Autonomic Feedback to Mani-
fest Anxiety and Ego Strength


Sexes, it is clear that those who report more 


negative correlations. The correlation pattern
supports Mandler’s contention that autonomic 
perception is part of the anxiety complex. 
Subjects who are less capable of integrative 
functioning and who are simultaneously more 
prone to emotional disturbance experience a 
wider range and more intense somatic symp- 


toms. The APQ vs. MA scale correlations of 
this study are of about the same order as 
those reported by Mandler et al. (1958) in 
their first report, though somewhat higher
than the coefficient of .267 between Anxiety
APQ and MA scale reported in their second 
study (Mandler & Kremen, 1958). Although 
the correlations between procedures within 
each sex group are generally in the same 
range, it is noteworthy that those for women 
are in all cases higher. 


Relationship of Autonomic Feedback to Ad- 
jective Self-Description 


In order to compare autonomic feedback to
subjects’ ACL self-description, the male and
female distributions of Anxiety APQ scores
were divided into near-equal thirds, thus
forming high, median, and low APQ groups
of men and women. These APQ groups were
compared, first, in terms of the Nowlis
factor analytically defined variables (Table 3)
and, second, in the distribution of their ratings
on the 130 adjectives considered individually
(Table 4). Three-way split of the
APQ distribution was used to detect curvilinearity,
if present. Although the mean
scores of the median APQ group often fell
close to, or beyond, one of the extreme groups,
in none of the significant comparisons presented
in Tables 3 and 4 was the relationship
clearly curvilinear. Hence, these findings can
be taken as generally descriptive of the differences
in ACL variables between higher and
lower APQ scorers, of a roughly linear sort
and certainly characteristic of extreme APQ
groups.
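
The three-way split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the cut-point rule, and the example scores are hypothetical, and the report does not say how tied scores at the cut points were handled.

```python
def split_into_thirds(scores):
    """Label each score 'low', 'median', or 'high' by its rank in the group."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # indices, lowest score first
    cut1 = (n + 1) // 3          # number of subjects placed in the low third
    cut2 = (2 * n + 1) // 3      # rank at which the high third begins
    labels = [""] * n
    for rank, idx in enumerate(order):
        if rank < cut1:
            labels[idx] = "low"
        elif rank < cut2:
            labels[idx] = "median"
        else:
            labels[idx] = "high"
    return labels

# Hypothetical scores for 37 subjects (the size of the female sample).
example_scores = [50 + (7 * i) % 37 for i in range(37)]
example_labels = split_into_thirds(example_scores)
print({g: example_labels.count(g) for g in ("low", "median", "high")})
```

The split is made separately within each sex, so that the sex difference in mean APQ does not determine group membership.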

The two factor variables which best distinguish
the APQ groups are Anxiety and
Depression (Table 3). Among both men and
women, high APQ subjects are significantly 
higher in their mean Anxiety and Depression 
scores than the median and low APQ scorers.
Suggestive, though less impressive, differences
are found in the sets of scores defining the
two poles of the Activation factor. Both male
and female high APQ subjects tend to describe
themselves in Deactive terms (though
the F ratio is not significant for women).
However, among women, high APQ scorers
also tend to be higher in Activation, while the
men if anything trend in the opposite direction.

More detailed examination of the indi- 
vidual adjective ratings amplifies these find- 
ings, and makes clearer the apparent sex dif- 
ference in Activation. In Table 4 are listed 
the adjectives in which the three APQ groups 
differ, on chi square analysis, at five levels of 
significance. It might be noted that one would
expect by chance 26 significant comparisons
at the .10 level or better in the 260 comparisons
(130 adjectives, 2 groups). In fact, there
are 56, over twice as many. Of greater importance,
however, is the internal consistency
and apparent sense that can be made from
the pattern of adjective self-descriptions.
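
The chance expectation quoted above is simply the number of comparisons times the significance level. A rough Python sketch follows; the binomial tail is an added illustration that assumes the 260 comparisons are independent, which adjective ratings certainly are not, so it should be read only as an order-of-magnitude check and not as part of the original analysis.

```python
from math import comb

def expected_by_chance(n_tests, alpha):
    """Number of 'significant' results expected if every null hypothesis holds."""
    return n_tests * alpha

def binomial_upper_tail(n_tests, alpha, observed):
    """P(at least `observed` significant results), assuming independent tests."""
    return sum(
        comb(n_tests, k) * alpha ** k * (1 - alpha) ** (n_tests - k)
        for k in range(observed, n_tests + 1)
    )

n_comparisons = 130 * 2          # 130 adjectives, male and female groups
expected = expected_by_chance(n_comparisons, 0.10)
tail = binomial_upper_tail(n_comparisons, 0.10, 56)
print(expected, tail)
```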

High APQ male subjects, contrasted to
those reporting fewer and less intense somatic
symptoms, describe themselves as inadequate,
inactive, and helpless people. The self-image
of weakness and incompetence is conveyed by
the positive endorsement of such items as
weak, helpless, hesitant, and the like; and by
low ratings on items which suggest active 
mastery, such as independent, resourceful, 
and effective. Emotionally, the adjective pat- 
tern suggests depression and feelings of fu- 
tility—ashamed, downhearted, clutched up, 
frustrated—rather than any acute emotional 
distress such as might be expressed in hos- 
tility or anxiety. They are defeated, rather 
than angry men. Overall, one has the impres- 
sion of ego-weak, inadequate, and dependent 
people whose symptoms are more those of 
“neurotic debility” than of acute emotional 
distress. 

While high APQ female subjects, compared 
to their low APQ sex mates, also convey a 
distinctly neurotic impression, the pattern of 
adjective self-description differs from the male 
and suggests a more complex syndrome. As 
with the men, there are signs of inadequacy 
and weakness, but along with this consider- 
ably more evidence of stronger and more 
labile emotions—belligerent, lonely, over- 
joyed, irritated, angry. These subjects seem 
to experience higher levels of excitement, with 
wider swings of mood and activity, and more 
capacity for energetic striving, though per- 
haps with no more assurance of success than 
the men. Compared to their male counter- 
parts, high APQ females seem less concerned 
with their inadequacy, while describing more 
hostile and generally emotional interaction. 
They are more active and aggressive, men 
more passive and self-doubting. In terms of 


TABLE 3

COMPARISON OF NOWLIS ADJECTIVE CHECK LIST FACTOR SCORES FOR LOW, MEDIAN, AND HIGH
AUTONOMIC PERCEPTION QUESTIONNAIRE SCORERS, MALE AND FEMALE SUBJECTS SEPARATELY

                 Male APQ groups                  Female APQ groups
        Low   Median   High   F   p       Low   Median   High   F   p
: 8.2 8.2 13 
Ae cttration w gs 80 o om a 0 done A 
vessi 5 5 5.8 ‘ 
mem POR uoga É BO M om a 
ati 5 8.6 62 ns — a: l 0 5 
Potaa 22 A s gu <ë 42 42 S8 202 ns 
3 ism (Œ) D=) aa 39 47 170 ns 35 n = A ns 
Socia] Aea n 41 92 9.0 64 ns P3 = oe e i 
AgPtession (O) m ha 3.8 5.1 5.52 <.01 z A o p 
n 2 By Sh 4 5 <.005 i 32 3O 5i j 
tety (H) 3.5 3.0 44 635 <.005 ‘ 


Note.—Male N = 139; female N = 37.
* Variances were not homogeneous; therefore, the Kruskal-Wallis
H test was substituted for the F test.
TABLE 4

ADJECTIVE SELF-RATINGS WHICH SIGNIFICANTLY DIFFERENTIATE HIGH, MEDIAN, AND LOW ANXIETY
AUTONOMIC PERCEPTION QUESTIONNAIRE SCORERS WITHIN MALE AND WITHIN FEMALE GROUPS

                      Adjectives(a) which differentiate high,
                      median, and low APQ among:
Significance level    Men                 Women

<.001                                     jittery
                                          doubtful
                                          shocked

<.01                  helpless            insecure
                      inactive            full of pity
                      thirsty             sleepy
                      -wideawake          belligerent
                                          lonely

<.02                  ashamed             downhearted
                      washed out          regretful
                      grouchy             overjoyed
                      -independent        -calm
                      -effective

<.05                  downhearted         helpless
                      startled            startled
                      frustrated          energetic
                      hesitant            skeptical
                      slow                irritated
                      bored               angry
                      clutched-up         -optimistic
                      weak
                      -resourceful
                      -bold
                      -careful

<.10                  self-conscious      self-conscious
                      jittery             frustrated
                      insecure            hesitant
                      dissatisfied        careless
                      timid               boastful
                      smug                subdued
                      blue                restrained
                      -serious
                      -alert
                      -satisfied

Note.—Male N = 139; female N = 37.

(a) Adjectives are recorded here exactly as they appear in the
ACL. If there is an inverse relation between APQ level and
adjective rating (i.e., Highs rated lower than lows), a minus
sign appears before the adjective. Adjectives which appear in
both the male and female list are italicized.


the sex norms of our culture, one might specu- 
late that the core neurotic problem expressed 
by each high APQ group represents inability 
to meet its particular sex standards: the males 
are too dependent and ineffectual to be men;
the females are too hostile and energetic to be
women. The two adjective patterns suggest
the “castrated male” and “penis-envy female”
syndromes of psychoanalysis. Less speculatively,
it can be concluded that ACL analyses
fully support the earlier presented correlative 
analyses in showing individuals reporting 
greater autonomic feedback to be generally 
more neurotic, less well integrated, and more 
prone to emotional disturbance, while further 
suggesting personality characteristics at odds 
with effective sex role functioning. 


Somatic Symptom Choice 


It may be of some general interest to note 
the relative popularity of various somatic 
symptoms contained in the APQ, and to con- 
sider whether the sexes differ in their symp- 
tom choice. For rough and exploratory analy- 
sis, the 30 item means (original 21 plus our 
additional 9 items) were separately ranked
for women and men, to compensate for the
sex differences in mean scores already noted.

In general, men and women are quite similar
in their relative orders of symptom choice.
Below are listed the five items receiving the
highest ranks and the five ranked lowest for
the entire student group. In parentheses, the
rank for men, then for women, is indicated.
Those items added to the original APQ are
marked with an asterisk. Recall that all items
are preceded by the stem, “When you feel
anxious. . . .”


Most highly rated items:

do you get a fluttering feeling in your
stomach (“butterflies”)?* (1, 1)

how often are you aware of bodily reactions?
(3, 2)

are you aware of increased muscle tension?
(2, 6)

do you ever feel weak or shaky?* (6, [illegible])

are you aware of many bodily reactions?
(4, 7)


Least highly rated items:

do you experience nausea?* (30, 30)

do you feel as if you might faint?* (29, 21)

do you have to defecate frequently (diarrhea)?*
(28, 28)

do you experience a slowing of the heart?*
(27, 29)

do you get a headache? (26, 23)


Note that four of the five least common 
symptoms are in response to items introduced 
by us. Perhaps such symptoms occur only with 
higher degrees of anxiety than these essen- 
tially healthy young men and women have 
experienced. 

Seven of the items had identical ranks, or 
ranks differing only by one unit. In addition
to the four included in the lists above, men 
and women were essentially alike on: 


does your mouth become dry? (10, 10) 

do you have to urinate frequently? (15, 16) 

are you bothered by your bodily reactions? 
(17.5, 18.5) 


By contrast, inspection of the seven items 
which most differed in the rankings for men 
and women suggests some sex differences: 


do your hands become cold? (25, 8.5)

do you have difficulty talking? (19, 8.5)

does the intensity of your heart beat increase?
(11, 20)

how often are you aware of change in your
breathing? (12, 21)

does your stomach get upset? (20, 13)

do you perspire? (5, 11.5)

do you get a lump in your throat or choked-up
feeling? (9, 4)
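
The ranking procedure behind these lists (item means ranked separately within each sex, with tied items sharing the mean of the ranks they occupy, as in the 8.5 and 17.5 entries) can be sketched as follows. The item means below are invented purely for illustration and are not taken from the study; the function name is likewise hypothetical.

```python
def average_ranks(means):
    """Rank items from highest mean (rank 1) down; tied items share the mean rank."""
    order = sorted(means, key=means.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of items tied with the item at position i
        while j + 1 < len(order) and means[order[j + 1]] == means[order[i]]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2   # mean of the ranks i+1 .. j+1
        for item in order[i:j + 1]:
            ranks[item] = shared
        i = j + 1
    return ranks

# Hypothetical item means, not data from the report.
men_means = {"butterflies": 4.1, "cold hands": 1.9, "perspire": 3.2, "dry mouth": 2.6}
women_means = {"butterflies": 5.0, "cold hands": 3.4, "perspire": 3.4, "dry mouth": 2.9}

men_ranks = average_ranks(men_means)
women_ranks = average_ranks(women_means)
rank_gap = {item: men_ranks[item] - women_ranks[item] for item in men_means}
print(rank_gap)
```

Ranking within each sex before comparing removes the overall sex difference in mean scores, so a large gap reflects relative symptom preference only.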


Though the overall picture is one of greater
agreement than difference in the somatic ways
in which men and women express anxiety, the
differences that are found seem consistent with
clinical psychiatric and psychosomatic experience.
Thus, women more commonly name cold
hands, reminiscent of Raynaud’s disease which is
more frequently found among women, and
difficulty in talking and lump in the throat,
suggestive of hysteria. Though the differences
are somewhat less great, men favor cardiac
and respiratory symptoms, perspiration, and
muscle tension, which taken together suggest
the normal physiological reactions to
exertion. Cardiovascular diseases are more
commonly found among men, at least as
indicated by mortality statistics in middle age.
An important and long standing question in
psychosomatic research, to which these findings
may contribute slightly, is whether there are
individual and group, such as sex, differences 
in somatic response in the acute emotional 
state which might be predictive of “symptom 
choice” in the psychosomatic disease state. 


Relation between Anxiety and Pleasure APQ 


Thus far, attention has been centered on 
somatic experience in anxiety (the anxiety 
APQ score), and consideration of the signifi- 
cant positive correlation between Anxiety and 
Pleasure APQ noted in Table 2 has been by- 
passed. In their original study, Mandler, 
Mandler, and Uviller (1958) report correla- 
tions of .50 and .45, for two male samples, 
between the two autonomic feedback scores. 
Although little systematic attention is given 
to this relationship—and in their later report 
Mandler and Kremen (1958) do not treat 
Pleasure APQ at all—these findings suggest 
that somatic sensitivity may be an individual 
characteristic in all or many states of emo- 
tional arousal. Generally anxious and ego- 
weak people, as we have already seen, are 
more prone to such experience, but apparently 
when happy as well as when anxious. At the 
same time, all subjects report more somatic 
experience in anxiety than in pleasure. Com- 
parison of the duplicated items in the two 
scales shows consistently and significantly 
higher ratings for the Anxiety than for the 
Pleasure form. Apparently, then, while all 


subjects experience more somatic involvement 
in anxiety, individual differences in sensitiv- 


ity to such experience are consistent across 
both states. 
SUMMARY 


This paper reports a correlative study of
n based on the Mandler 
Autonomic Perception Questionnaire, Barron 
Ego Strength scale, Taylor Manifest Anxiety 
scale, and Nowlis Adjective Check List scores 
of 139 male and 37 female college students. 
Women are significantly higher in autonomic 
feedback scores, lower in ego strength and 
higher in manifest anxiety, though the within- 
sex correlations were substantially the same 
for both sexes. In both groups, autonomic 
perception correlated positively with manifest 
anxiety and negatively with ego strength; the 
latter two correlating negatively between 
themselves. High APQ scorers had signifi-
cantly higher scores on the Nowlis-Green fac- 
tor variables of anxiety and depression. How- 
ever, within each sex group, proneness to au- 
tonomic feedback may characterize those who 
are least successful in their sex roles. In ad- 
jective self-descriptions, high APQ males re- 
veal themselves as ineffectual, dependent, and 
depressed, high APQ females as more active 
and aggressive with stronger and more labile 
emotions. Thus, it may be concluded that re-
porting more numerous and intense somatic
experiences in emotional states is generally 
characteristic of more neurotic, inadequate, 
and anxious individuals, though the particular 
neurotic problem may differ for each sex. 
Though more autonomic feedback is re- 
ported in anxiety than in happiness, the posi- 
tive correlation between autonomic feedback 
scores in the two states suggests that the 
tendency for somatic involvement, at least as 
assessed by the APQ, is a general character- 


istic of individuals in a variety of emotional 
states.


Sheldon J. Korchin and Helen A. Heath 
THE BENDER GESTALT:
A CLINICAL STUDY OF CHILDREN’S RECORDS 


WENTWORTH QUAST 
University of Minnesota Medical Center 


The need for indicators of brain dysfunction
at all ages is apparent to clinical psychologists,
but is particularly acute in the assessment
of children where changes with age
and wide variation of many growth characteristics
within an age range complicate the
problem. Incidence figures for brain injury in
children in this country are highly variable,
although approximately three million is an
estimate made by Martha Elliott (1956),
Chief of the Children’s Bureau. This figure
does not include an additional number of children
with behavior and learning disorders presumed
to result from organic pathology.

One of the main problems confronting the
child clinical psychologist is in the relative
weights to be assigned to emotional, functional,
dynamic, or learned components as
contrasted with intrinsic, constitutional, or
organic components in the unusual symptomatology
presented by the child patient.
One measure of presumably intrinsic defects
is difficulty in coordination as evidenced in
visual-motor tasks. For research and clinical
evidence that organic patients do have
visual-motor difficulties, the reader is referred to
reviews by Klebanoff (1945) and Klebanoff,
Singer, and Wilensky (1954). The present
study was undertaken to test the validity of
the Bender as an indicator of organic components,
using 100 child subjects, either
inpatients or outpatients of the Division of
Child Psychiatry, University of Minnesota
Medical Center.

The problem of establishing clear criteria
for brain injury in children is complicated by
the subjective nature of neurologic examination,
by problems of reliability and validity
in EEG interpretation, and by the frequent
lack of any demonstrable physical changes in
a child with known central nervous system
defect. A survey of the final diagnoses of 325 
consecutive inpatient admissions showed acute 
or chronic brain syndromes to constitute about 
25% of the population. Considering the lack 
of definitiveness in making such a diagnosis, 
this figure is probably low. In the search for 
criterion groups it was felt more meaningful 
to examine the original impressions of refer- 
ring physicians or agencies, the presenting 
complaints, or the differential diagnoses con- 
sidered at the time of admission to hospital 
for suspicions of brain injury. When these 
are considered and the problem becomes one 
of examining for possible central nervous sys- 
tem involvement, the proportion of suspected 
brain injured becomes more nearly 50% of 
the clinic population. If the problem of men- 
tal deficiency were to be included, the prob- 
lem of determining “familial” versus “or- 
ganic” etiology is also at least a 50-50 base 
rate problem. It should be clear that the terms 
“brain injury” and “suspected organic brain 
damage” are used in a categorical sense, to 
indicate a wide range of central nervous sys- 
tem defects. Also it should be clear, while in- 
accuracies on the Bender beyond a certain 
age may be related to cerebral defect, the ex- 
tent of such defect is not implied. The test is 
examined as a screening device capable of 
separating out those children who warrant a 
“second look” neurologically. 


METHOD 


Since it was possible to identify some homogeneity 
among various presenting complaints of child psy-
chiatric patients, criterion groups were designated ac-
cording to whether or not brain damage was sus- 
pected: (a) those with suspected emotional disturb- 
ance without suspected brain damage and (b) those 
with suspected brain damage, with or without emo- 
tional disturbance. Notations regarding any suspected
organicity appeared in the hospital chart in the form
of a physician’s referral specifically for EEG
study; in the form of behavioral complaints from
physician or schools related by them to brain damage
by inference, such as extreme hyperactivity, short
attention span, impulsivity, etc.; and in the form
of notes regarding questionable or known seizures, or 
notes as to sequelae from illnesses or trauma. 

The emotionally disturbed group consisted of pa- 
tients in whom brain injury was not mentioned as 
suspect by any person in the chain of individuals 
from the original source of referral through the final 
examining physicians in hospital. Presenting com- 
plaints common to this group were fears, obsessions, 
lying, stealing, truancy, pathological shyness or with- 
drawal, allergies, and somatic complaints. No child 
whose presenting complaint was mental retardation 
was included in either group. 

The resulting two groups were matched relative to 
their socioeconomic level in proportions according to 
father’s occupation using the Minnesota Scale of 
Parental Occupation. The lower age limit was 10 (9 
years 10 months), and the upper age limit 12 (12 
years 11 months). A limit of 1 year’s age span would 
have been most desirable to insure homogeneity in 
this regard but sufficient numbers were not available 
within such a narrow limitation. Children aged 10 
were chosen since normative data previously obtained 
by Quast (1957) indicated absence of most Bender 
deviations by that age. A number of “organic signs”
on the Bender seen even in adult records occur as 
normal developmental phenomena up to age 8 but 
usually not beyond 10. Thus, such an age selection 
controlled developmental deviation to large extent. 
It was impossible to make the sex ratio comparable
to the population at large since the usual clinic population
averaged two boys to one girl. The final selection of
patients was considered to be a representative sample
of the clinic population.

The mean IQ for the emotionally disturbed group
was 99.5, with a standard deviation of 15.72. The mean
IQ for the suspected brain damage group was 81.7,
with a standard deviation of 16.55. The difference
in mean IQ was a reliable one with a t of [illegible],
p < .0001; however, previous data (Quast, 1957)
would indicate IQ per se not to be a significant determinant
in Bender performance. Also, the mean
mental age of the brain damage suspects (9 years [illegible]
month; SD, 2 years 1 month) exceeded the cutoff
mental age 8, below which deviations occurred as
normal developmental phenomena.

The two groups were similar in that they could be
considered “typical” child psychiatric patients. They
shared common referral sources and represented a
wide range of problems.

The Benders were administered by the writer or
by advanced graduate students during their hospital
internship in clinical psychology. All tests were
scored according to the Peek-Quast system (1951)

TABLE 1

RELATION OF BENDER ATTRIBUTES TO CRITERION GROUPS (SUSPECTED
BRAIN DAMAGED AND SUSPECTED EMOTIONALLY DISTURBED)

Columns: (1) % brain damaged suspects showing sign (N = 50); (2) % emotional
suspects showing sign (N = 50); (3) relation of sign to criterion groups in
terms of phi; (4) level of significance; (5) % normals, ages 10-12, showing
sign(a).

Attribute or sign                  (1)     (2)     (3)      (4)     (5)
Scalloping                         [?]     [?]     [?]      [?]     [?]
Dashing                            24      4       .288     .01     [?]
Perseveration                      48      12      .393     .01     [?]
Rotations, 2 or more, +2 or
  +3, not Figures a, 3             32      2       .399     .01     [?]
Reversal                           26      0       .387     .01     [?]
Confabulation                      26      0       .387     .01     [?]
Angulation, +2                     40      2       .483     .01     [?]
Mixed orientation                  36      36      .000     -       [?]
Card turning, inversion            24      12      .156     -       [?]
Major distortion                   56      4       .567     .01     [?]
Erasures, absence of               50      26      .247     .05     [?]
Excessive pressure                 46      50      -.040    -       [?]
Separation, +2                     20      0       .333     .01     [?]
Flattening                         24      22      .024     -       [?]
Exaggeration                       64      42      .220     .05     [?]
Line substitution                  0       0       .000     -       [?]
Slope                              66      34      .320     .01     [?]
Global clinical impression         80      12      .682     .01     [?]

(a) Presented for convenience in comparison.
TABLE 2

INTERCORRELATIONS BETWEEN ATTRIBUTES CHARACTERISTIC OF THE
SUSPECTED BRAIN DAMAGE SAMPLE

       2      3      4      5      6      7      8      9      10
1    -.20   +.43   -.06   +.06   [?]    [?]    [?]    [?]    [?]
2           [?]    -.19    .00   +.06   +.14    .00    .00   +.13
3                  -.17   +.02   +.42   +.28   +.03   +.12   [?]
4                         +.14   +.27   +.10   +.18   +.30   +.25
5                                +.38   +.57   +.25   +.49   [?]
6                                       +.56   +.24   [?]    [?]
7                                              +.30   +.04   [?]
8                                                     +.39   [?]
9                                                            [?]

Note.—Correlations are phi coefficients. 1 = Scalloping,
2 = Dashing, 3 = Perseveration, 4 = Rotation, 5 = Reversal,
6 = Confabulation, 7 = Angulation, 8 = Major Distortion, 9 = Separation, 10 = Slope.


by the writer without knowledge of the patient’s
name or classification. A global judgment was made
on appraisal of the record as a whole, on a no brain
damage versus brain damage dichotomy, prior to
scoring.

Analysis of the scored records consisted of comparing
the two groups for the presence or absence of
a number of Bender attributes or signs, found in the
examiner’s experience to be most common in the
records of brain injured children and adults. An
a priori selection was made of 17 attributes most
likely to discriminate the two groups.

Item validity was obtained by the use of Jurgensen’s
tables (1947) for the determination of phi coefficients,
and levels of significance were calculated.
For those attributes showing the best discrimination
(.01 level), intercorrelations were also calculated.
Scores were assumed to be dichotomous so the
correlation method selected was the phi coefficient.
Test for the null hypothesis can be made through phi’s
relationship to chi square. Chi square = Nφ². If chi
square is significant in a four-fold table, the corresponding
phi is significant at the same level. Specifically,
where N = 100, chi square is significant at
the .01 level if it is at least 6.6, and at the .05 level
if it is at least 3.8.
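
The relationship just described can be made concrete with a short Python sketch. It is an illustration only: the author worked from Jurgensen's tables, whereas the code computes phi directly from the four-fold counts; the function names are hypothetical, and the cutoffs 6.6 and 3.8 are the rounded values quoted in the text. The example counts are the Dashing and Exaggeration percentages of Table 1 converted to frequencies out of 50.

```python
from math import sqrt

def phi_coefficient(a, b, c, d):
    """Phi for a four-fold table.

    a, b = first group with / without the sign
    c, d = second group with / without the sign
    """
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0  # a margin is empty (e.g., the sign never occurs)
    return (a * d - b * c) / denom

def significance_level(phi, n):
    """Significance of phi through chi square = N * phi**2 (1 df)."""
    chi_square = n * phi ** 2
    if chi_square >= 6.6:
        return ".01"
    if chi_square >= 3.8:
        return ".05"
    return "ns"

# Dashing: 24% of 50 brain damage suspects, 4% of 50 emotional suspects.
phi_dashing = phi_coefficient(12, 38, 2, 48)
# Exaggeration: 64% versus 42%.
phi_exaggeration = phi_coefficient(32, 18, 21, 29)

print(round(phi_dashing, 3), significance_level(phi_dashing, 100))
print(round(phi_exaggeration, 3), significance_level(phi_exaggeration, 100))
```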


RESULTS AND DISCUSSION 


The results of the comparisons between
the performances of the suspected brain damage
group and the suspected emotionally disturbed
group are presented in Table 1, which
gives the relation of the Bender attributes
to criterion groups in terms of phi coefficients
and the levels of significance. Normative
data on the 10- to 12-year age
groups combined are presented for comparison.

It is seen that 10 of the attributes separate


the two groups at the .01 level of significance. 
Two others showed significant differences be- 
tween the groups at the .05 level. The fact 
that clinical impression of the record as a 
whole separated the groups best is not sur- 
prising when it is learned that the suspected 
brain damage group averaged 3.6 attributes 
per patient as contrasted with a mean of .60 
for the suspected emotionally disturbed. False 
positive Bender signs in the suspected emo- 
tionally disturbed occurred to a significant 
extent on only 1 of the 10 discriminating at- 
tributes. This attribute, Slope, while occur- 
ring in two-thirds of the brain injury suspects, 
occurred in one-third of the suspected emo- 
tionally disturbed. The busy clinician should
note that the most discriminating signs were
generally also the easiest to score. These data 
suggest that a cutting score based on the 
signs which give optimal discrimination, pos- 
sibly weighting the better signs, could be de- 
veloped through a cross-validation study. 
When intercorrelations were calculated for 
these 10 attributes (Table 2) it was found 
that in general they appeared to show little 
intercorrelation with one another. Because of 
interest in the underlying strength of relation- 
ship between signs rather than in making pre- 
dictions from one sign to another, and rec- 
ognizing the restriction which reduction of 
frequencies to 2 X 2 tables places upon phi 
coefficients, the obtained phi’s should be in- 
terpreted for size in light of the maximal phi 
coefficients possible. About 8 of the 45 inter- 
correlations were appreciably affected by such 
considerations (for example: φ13 max = .65,
when obtained φ13 = .43; φ36 max = .61, when
obtained φ36 = .42).
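
The ceiling on phi mentioned here depends only on how often each of the two signs occurs. A minimal Python sketch, assuming the standard formula for the maximal phi of a four-fold table with fixed marginals (the text does not state which formula was used, and the function name is hypothetical), is given below. The example uses the Table 1 frequencies of Perseveration (48%) and Confabulation (26%) in the suspected brain damage group, attributes 3 and 6.

```python
from math import sqrt

def max_phi(p_a, p_b):
    """Largest phi attainable when two signs occur in proportions p_a and p_b.

    Both proportions must lie strictly between 0 and 1.
    """
    low, high = sorted((p_a, p_b))
    return sqrt((low / (1 - low)) * ((1 - high) / high))

ceiling_3_6 = max_phi(0.48, 0.26)
print(round(ceiling_3_6, 3))
# An obtained phi can then be read against its ceiling, e.g. 0.42 / ceiling_3_6.
```

When the two signs occur equally often the ceiling is 1; the more unequal the frequencies, the lower the largest possible coefficient.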

These data are consistent with the hypothe- 
sis that the Bender may be tapping a variety 
of defects which may exist, in combination or 
separately, in an individual patient. If this is 
true and each sign is valid, and the probabil- 
ity of their occurring in patterns is low (low 
correlation), then a count of critical signs 
takes on considerable significance for diag- 
nostic purposes. Validity of a total score is 
much enhanced when separate parts or items 
are valid individually but modestly intercor- 
related. However, a combinatory sign ap- 
proach may be ultimately less helpful in terms 
of explanation of the deviations than a fur- 
ther exploration of a particular sign. 

For example, while the attribute Persevera- 
tion can logically be linked with other known 
Perseverative behaviors and thinking of the 
brain injured person, Scalloping, Rotations, 
Angulation, and/or others, may be linked with 
defects hitherto unexplored. Attention to in- 
dividual signs may also throw light on the 
problem of whether a more fundamental, gen- 
eral disturbance (for example, “integration”) 
may be involved which is expressing itself in 
a variety of ways, rather than associating 
signs with localized brain areas. That a more 
central, general, integrative phenomenon is op- 
erant is suggested in this sample by the find- 
ings that some visual motor disturbance was 
common to the group despite the wide varia- 
tion of central nervous system complaints. 

Seriously ill patients have at times been 
able to describe the reasons for their devia- 
tions, and several of their explanations have 
been illuminating. With some consistency pa- 
tients in acute neurologic states have described 
a conservation of energy as the main reason 
for certain deviations. In verticalizing hori- 
zontally oriented figures, for example, pa- 
tients have stated it “easier” to make flexor 
rather than extensor motions. Other acutely ill 
patients have described with some distress 
their inability to make an angle in one direc- 
tion while demonstrating facility with angula- 
tion in another. That this difficulty in lateral- 
ity has an appropriate cerebral morphological 
correlate or that defects in circumscribed cor- 
tical sensory or motor areas are at least “in-
strumental” prerequisites, would seem justi-
fied since peripheral muscle mechanics offer
inadequate explanation in most cases. 


SUMMARY 


The validity of certain Bender deviations 
as indicators of cerebral dysfunction was ex- 
amined in 100 child psychiatric patients, aged 
10 to 12. On the basis of their presenting 
complaints, patients were divided into two 
groups designated as suspected brain dam- 
aged and as suspected emotionally disturbed. 
Mental age and socioeconomic status factors 
were controlled. 

An a priori selection of 17 attributes nor- 
mally not occurring after age 8 showed 10 
of these to differentiate the two groups at 
the .01 level. False positive “organic” signs 
in the suspected emotionally disturbed oc-
curred in but one discriminating attribute. In-
tercorrelations between the 10 attributes hav-
ing the best discriminatory power showed
them, in this sample, to have in general low
positive correlations with one another.

These data suggest that the practicing cli-
nician would do well, when these 10 attributes
appear in the records of his patients, to con-
sider “neuronic” rather than neurotic etiology
for the child’s behavior. For the researcher, it
is hoped that these findings may serve as a
building block toward spelling out the hetero-
geneous nature of that group now referred to
categorically as brain damaged.
A COMPARISON BETWEEN HYPNOTICALLY INDUCED
AGE REGRESSIONS AND WAKING STORIES 
TO TAT CARDS: 


A PRELIMINARY REPORT 


JOSEPH REYHER 
Michigan State University 


Estimating the degree to which a client in 
therapy has worked through important areas 
of conflict often imposes a difficult judgmen-
tal task upon the therapist. A method for
increasing the objectivity of this task was
developed in connection with an exploratory
investigation involving the comparison of
hypnotically induced age regressions to TAT
cards as stimuli and subsequent waking stories
to the same cards. The original intent of the
study was to develop a procedure for reduc-
ing the artificiality of hypnotically induced
conflict so the results of such data could be
more meaningfully interpreted; however, the
immediate diagnostic significance of the data
overshadowed some of the more long range
experimental goals. It was found that differ-
ences between the content of the hypnotic
and waking reactions significantly extended
diagnostic impressions, often reflected unre-
solved areas of conflict, were a useful psycho-
therapeutic aid, and provided valuable in-
sights into hypnosis itself. Psychoanalytic
theory served as the frame of reference for
both the psychotherapy and the experimental
design.
METHOD


Subjects

Five subjects were used. Because of the possibility
that adverse posthypnotic reactions might occur as a
result of the stimulation of a subject’s emotional con-
flicts, only clients were used who were engaged in
psychotherapy. Furthermore, [...] some control over
their emotional reactions in the waking state in order
to reduce the possibility of [...]. Three of the subjects
were characterized by predominantly neurotic
reactions; the other two
were characterized by predominantly psychotic re-
actions, but they were sufficiently intact to function
without hospitalization.


Procedure 


Since the same TAT cards were not suitable for 
both sexes, two sets of 17 cards each were assem-
bled.¹ Eight of the cards were common to both sets.
A random procedure was used to determine, for each 
subject, the selection of 10 cards from the appropri- 
ate set, the designation of the cards as either con- 
flictual (c) or neutral (n), and the order in which 
the cards were presented.²

Conflict cards. While hypnotized, the subject was 
told that he would be asked to look at a picture 
that would activate disturbing emotions. He then 
was asked to open his eyes and look at the card. 
After about 10 seconds, he was instructed to close 
his eyes and to go back in time to a period when 
these emotions were very difficult to manage. He was 
then asked to verbalize his experience. 

Neutral cards. The instructions were the same as 
for the c-cards except that the subject was told that 
the emotions to be experienced would not be dis- 
turbing but, nevertheless, would be meaningful. 

Posthypnotic suggestions. All subjects were given 
the following instructions: 


Sometime later, when you are awake, Dr. A. will 
give you the same pictures that I gave you ear- 
lier, and he will ask you to tell stories about them. 
Each picture will stir up the same feelings, emo- 


tions, and ideas that it did before, but you will 


1 The cards unique to the female set were 3GF, 


18GF, 13G, 12F, 9GF, 7GF, 8GF, 17GF, and 2; the 
cards unique to the male set were 8BM, 18BM, 20, 
14, 6BM, 7BM, 12BM, 13B, and 1; the cards which 
were common to both sets were 5, 10, 4, 3BM, 12BG, 
6GF, 13MF, and 17BM. Cards 15, 16, 11, 9BM, and 
19 were not used. 

2 There was one restriction on the random pro-
cedure which was used for determining the con-
flictual and neutral designations of the cards. The
structure of card 13MF was too extreme in a con- 
flictual direction to risk giving it a neutral desig- 


nation. 
either reveal them directly or indirectly in the
stories that you tell. 


The last sentence in these instructions was intended 
to give the subject control over the nature of what 
would be consciously experienced in order to reduce 
the possibility of a traumatic reaction to the pre- 
mature recognition of repressed material. 

Waking condition. The subjects were given the 
cards by Dr. A. with standard instructions.


RESULTS 


All subjects produced stories in the waking
state that varied in the extent to which they
corresponded with the hypnotic reactions to
the same cards. In order to evaluate objec-
tively the degree of similarity between the two
sets of data, the reactions to each card for the
two conditions were compared in terms of
three of the most obvious dimensions of simi-
larity which repression may influence: char-
acters, situations, and affective-motivational
state. The degree of similarity for characters
(C) and situations (S) was assessed by a four-
point rating scale with the following descrip-
tive labels and numerical values: personalized
(0), congruent (1), indeterminant (2), and
different (3). The affective-motivational state
(AM) was quantified by counting the expres-
sions of intentions, needs, and affects by the
subject in the hypnotic condition (H) and
by the corresponding character in the post-
hypnotic condition (PH). The AM units in
the latter condition were classified as either
the same (PH_s) or as different (PH_d) from
the hypnotic condition. Each AM unit was
counted only once in each of the two condi-
tions, regardless of how often it may have
appeared.
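
The counting rule for AM units described above can be illustrated with a minimal sketch. It assumes that each AM unit has already been reduced by a rater to a short label, and that two units are "the same" when their labels match exactly; both the labels and the function name `classify_am_units` are hypothetical and are not part of the original procedure, which relied on rater judgment.

```python
def classify_am_units(hypnotic_units, waking_units):
    """Count AM units once per condition and split the waking units.

    Assumption: units are rater-assigned labels, and identical labels
    mean "the same" unit. The original study used rater judgment.
    """
    h = set(hypnotic_units)    # each unit counted only once (H)
    ph = set(waking_units)     # each unit counted only once (PH)
    same = ph & h              # waking units also present under hypnosis
    different = ph - h         # waking units absent under hypnosis
    return {"H": len(h), "PH_same": len(same), "PH_different": len(different)}


# Hypothetical labels for one card; "wants approval" repeats but counts once.
counts = classify_am_units(
    ["wants approval", "fear", "wants approval"],
    ["admiration", "fear"],
)
print(counts)
```

Using sets reflects the statement that a unit is counted only once per condition, however often it appears.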


A difference score (D) for each card was 
devised: 


D = C + S + AM


where: 


a more com- 


plicated process than simple repression, 
and high scores were arbitrarily defined as
those which were in the first and fourth quar-
tiles, respectively.

An estimate of the interrater reliability for
two raters was obtained from rank order co-
efficients of correlation between D scores for
two of the protocols that were selected ran-


The following Protocol 


and n-cards which had high 
d B were similar inso- 


Subject A, Card 4 
Hypnotic re 


S. So Many times 
or give him a big kiss 
m afraid he wouldn’t— 


wanted h 


he stil 
» and I almost hated him 


d their day ht r 
cm. It seemed he po ters and took an inter: 


Just let me go my own way 
to T gather ta ke care o me and T dide want her 
would—T do qi tO- He just Wouldn't, but yet I 

4 o things for him, Like Pq zo golfing with 
e beer with him during the 

ot night—to please him, but 
Was to please him, not ‘cause 

ut I can’t it Seems like T wish I could. 
Oh, ahh let’s say, I believe 
Wo People—and he’s the 

: Nan at Would loye to tell his wife 
: S er but he can’t quite and she al- 
nóthin Vants him. to tell her all those 
cies and he finds it hard to do 50, 
understand one another 


° Console him, but she can’t because 


erstand it, 


These reactions reinforced the initial psy-
chodynamic impressions concerning sexual in-
clinations toward her father.


Subject A, Card 8GF (n-Instructions) 


Hypnotic reactions. I’m watching my brother and
he’s painting, and I admire him so much. I think he’s
kind of like a genius or something—the way he can
create pictures out of paint. I couldn’t do that and
I’m kind of jealous of him. Still, he was my brother
and I thought he would make something of himself
and I could be his sister. He used to make the most
beautiful scenes. He couldn’t paint people, just
scenery and things. They were so pretty.

Waking reactions. This girl really has a dreamy
look in her eyes. She is in a classroom and she
admires her—oh—English teacher very much. She
thinks to herself that someday she will be just ex-
actly like her. Is that short enough? That is, I
mean, does it make any difference how long or short
they are?


Her underlying attraction to her brother
becomes very clear. Substitution of the word
“wife” for her illogical use of “sister” would
be more congruent with unacceptable fantasies
concerning her attraction to her brother. At
no previous time had there been any evidence
that homosexual tendencies might be involved.
The waking reactions indicated that these
tendencies may be in the service of defense
against incestuous fantasies. The obvious de-
fensive disturbances in the waking reactions
reinforce these impressions.


Subject B, Card 7GF (c-Instructions) 


Hypnotic reactions. I am thinking about two things
and I can’t separate them. (What are they?) It seems
that I’m sitting on my mother’s lap and she is telling
me about having my first menstrual period and I
also seem to think at the same time about being tied
in a chair. I can’t—I can’t quite remember. I used
to get tied in the chair when I was bad and I
don’t know why.

Waking reactions. The mother is reading a [...]
to the little girl—and the little girl is holding [...] in
her arms—and tells her that someday she [...]
a mother—and be the kind of a [...] that
the mother is reading about in the book. (How is
she feeling?) She feels as if she would [...] to grow
up and it seems a long way off. (How [...]?)
She will grow up and have a baby.


In subsequent therapeutic sessions, succes-
sive dreams were induced in the subject re-
garding the emotions and thoughts behind the
image of sitting on her mother’s lap learning
about menstruation and the image of being
tied to a chair. The induced dreams
about menstruation were related to psychosexual con-
fusion which progressed to an abreaction of 
her anxiety regarding her wish to be a boy. 
The image of being tied to a chair was re- 
lated to a traumatic early childhood experi- 
ence in which she heard a voice, while she 
was in bed, telling her to kill her mother. 


Subject B, Card 17GF (n-Instructions)


Hypnotic reactions. My daddy and I used to ride 
on bicycles through the—part of the canal—and it 
was very enjoyable for me—by the water and a 
bridge. I always enjoyed it—riding on a bicycle there 
with my dad, because we always liked the water— 
threw stones in it. (Is there anything else?) I—used 
to like it, that’s all. 

Waking reactions. It seems to be a granary of some 
sort—there are sacks of grain on the ground. There's 
a girl on a bridge. In the background is the granary 
and she would like to leave it, but she can’t make up 
her mind whether she wants to or not. I think she 
would like to be on a boat and go away. (What will 
finally happen?) I don’t think she will. (Can you 
make up a story about why she is torn between 
leaving and staying?) She feels that the people in 
the house will miss her if she goes away. She would 
like to go somewhere else. 


The material produced in the hypnotic re- 
action was related to the many dreams she 
had concerning water. Successive dream in- 
duction produced the reliving of an experi- 
ence in which she was riding on her father’s 
back in the water while having thoughts that 
she wanted to possess him completely. 

Subject C was a tense, conscientious, mar- 
ried female who was successfully engaged in 
one of the medical specialties. Marital difficul- 
ties constituted the presenting problem. 


Subject C, Card 17GF (c-Instructions) 


Hypnotic reactions. This is my fear of high places 
for myself or to see someone else in a high place, 
even in the movies or TV, when they're out on a 
ledge. When we were at the Grand Canyon, other 
people would lie on their stomachs and would look 
over the edge. Why, I don’t know, unless heights
symbolized to me jumping and suicide—trying to 
stay away from the situation or avoid it. Well, it’s 
not so much fear of jumping myself as an accident, 
of someone not meaning to but accidentally pushing 
someone over or falling. Well intentions and yet un- 
avoidable. I have thought briefly in terms of suicide 
but whether I would ever be able to commit suicide, 
I decided that I would never be able to. (Anything 
else?) No, I don’t believe so.

Waking reactions. A woman on a high bridge feel- 
ing very despairing, uh—considering jumping. She 
feels that things haven’t gone as they should but she 


isn’t going to jump. 
The italicized portions reveal the underly-
ing hostile impulses which were betrayed in
terms of rather direct verbal representations.
The attempts to cover up were ineffectual,
and defensive reactions due to the marked 
breakthrough of murderous impulses were not 
present. Any such slips in the waking state 
probably would have activated a host of au- 
tonomic reactions and vigorous defensive ac- 
tivity. The brief and impoverished waking 
reaction supports this interpretation. No high 
D scores were associated with n-cards. 


DISCUSSION


In view of the omnipresent possibility of 
artifact due to the motivation of hypnotized 
subjects to please the experimenter by behav-
ing in a manner consistent with what they
believe is expected of them, caution must be
exercised in the interpretation of the results.
In terms of the procedure, it would seem
likely that subjects would perceive that the
instructions to the c-cards were of special in-
terest. If subjects had surmised correctly that
the experimenter had expected greater differ- 


achieving 
a more sharply defined and delineated psycho- 
dynamic impression. 

The enhanced clarity of the psychodynam-
ics to c-cards with high D scores supports
White’s (1941) observation that hypnosis is
an altered state of awareness in which there is
a contracted frame of reference and lack of
such critical functions as a sense of humor
and self-consciousness. It may be that a re-
duction in these functions proportionately re-
duces the anxiety producing potential of re-
pressed intrapsychic stimuli which enables
them to become more clearly represented in 
awareness. The most perplexing and unex- 
pected data, however, were the high D scores 
for a number of the n-cards to which the sub- 
ject had experienced positive affect in the hyp- 
notic reactions. Previous psychodynamic im- 


Joseph Reyner and Donald Shoemaker 


pressions indicated that this was not a case 
of simple forgetting, because the high D scores 
involved material related to the most central 
areas of conflict. Unlike the c-cards, material 
of equal or greater anxiety producing poten-
tial was reacted to with positive affect. The
two sets of instructions appeared to elicit dif-
ferent kinds of unconscious processes or dif-
ferent facets of the subjects’ conflicts.

The observed differences in the reactions to 


quired the subject to react with disturbing 
affect to whatever was brought to mind by the 
card. In response to this, the subject seemed 
to recall some 
that was distressing to him. When this reac- 
tion was experienced, 


tate of awareness, and 
would activate anxiety 
s interpretation is con- 
Psychoanalytic theory in that 


client is tek any conscious effort. a 

awakened. „, ttucted to talk until he 
ened, at which time he is simply asked
to talk about his reactions. The suspension of


Age Regressions und Waking Stories in TAT 413 


conscious effort and talking are designed to
reduce critical functions (secondary process) 
that would be activated by the need to com- 
municate. Although our experience with this 
method is still limited, the results have been 
very encouraging. When the spontaneous im- 
ages, fantasies, dreams, and memories which 
are produced become too transparent, a de- 
fensive posthypnotic amnesia may occur; 
however, subsequent hypnosis frequently can 
recover much of this material. 

Cards 8GF and 7GF for Subjects A and B, 
respectively, illustrate that alterations of af- 
fect and motivation can masquerade as health. 
Most of the waking reactions, however, con- 
tained some clues as to the presence of un- 
derlying conflicts, but the alterations often 
were so great that the nature of the actual
conflict could not be readily inferred without
benefit of the hypnotic reactions. The most
productive method of analysis with these data
was to consider the hypnotic and waking re-
actions jointly. The hypnotic reactions were
often closer to conflictual material, whereas
the waking stories gave a better picture of de-
fensive reactions. Thus, successive dream in-
duction and a combined analysis of both types
of TAT data present a better opportunity to
assess the nature of the underlying anxiety
producing processes and attendant defensive
reactions at three different levels of psychic
representation.

Not only do high D scores provide valuable
diagnostic information about the status of the
subject’s conflicts and the degree to which he
has worked them through, but they sometimes
open up areas previously unknown to the
therapist. One method of utilizing the high D
scores in therapy is to give the subject the
posthypnotic suggestion to recall all the age
regressions, with the provision that he can
[...] what he does not want to [...]. If the
age regressions to the cards with high D
scores are [...] recalled by the subject in the
waking state, further evidence of the dynamic
significance of the material is obtained.
These areas then become the foci of
subsequent hypnoanalytic sessions.

It is not surprising that the subject may re-
member some of these age regressions in the
waking state because the hypnotic material is
relatively disguised. If subsequent waking
and hypnotic interviews are unproductive in

further uncovering and working through of 
the material denoted by a given high D score, 
specialized hypnoanalytic procedures can be 
used. One such procedure which is particu- 
larly well suited for this purpose is successive 
dream induction concerning the emotions and 
experiences that produced the content of the 
age regression. Examples 7GF and 17GF for 
Subject B illustrate this procedure. 


SUMMARY 


The clinical and experimental value of hyp- 
notic and waking reactions to TAT cards was 
investigated. Ten TAT cards were selected 
randomly to be conflictual or neutral for five 
psychotherapy clients who were capable of 
deep hypnosis. The conflict-inducing instruc- 
tions were intended to activate disturbing re- 
actions to TAT cards. This was followed by 
an age regression to a period earlier in life 
when the activated reactions were particularly 
difficult to manage. The nonconflict-inducing 
instructions were the same except that they 
were intended to activate nondisturbing, but 
meaningful, reactions. 

A difference score (D score) was devised in 
order to evaluate quantitatively differences in 
characters, situations, and affective-motiva- 
tional states between the hypnotic and wak- 
ing reactions to each card. Both conflict cards 
and neutral cards had high D scores which 
reflected areas of conflict that had not yet 
come up in therapy and areas that had not 
been worked through adequately. 

Neutral cards with high D scores, unlike 
conflict cards with high D scores, were asso- 
ciated with positive affect. It was concluded 
that the conflict cards activated previous con- 
scious, conflictual experiences which more 
clearly revealed underlying repressed mate- 
rial than the waking reactions to the same 
cards. The high D scores to the neutral cards 
were interpreted as evidence that hypnosis is 
an altered state of awareness in which uncon- 
scious drives tend to be perceived in terms of 
gratification rather than threat. A procedure 
was described for utilizing this material in 


psychotherapy. 
Several investigators have suggested that
the degree to which a person’s speech departs 
from its usual level of coherence and economy 
is a likely indicator of anxiety (Dibner, 1956; 
Eldred & Price, 1958; Mahl, 1956). The fol- 
lowing experiment was designed to evaluate 
the predictive validity of measures of such 
speech disruption as indicators of anxiety. 


MEASURES 


Both Mahl (1956) and Dibner (1956) de- 
scribe measures of speech disruption. The two 
measures appear similar and produce highly
correlated total scores: the median for 15 
cases was .91 (Krause, 1961a). We felt that 
these measures could be improved with regard 
to their “psychological meaningfulness,” reli- 
ability, and range of sensitivity. In the first 
place there was a good deal of overlap be- 
tween the two measures. Mahl’s “sentence 
incompletion,” “repetition,” and “stutter” 
(plus “omission”) categories resemble Dib- 
ner’s “unfinished sentence,” “repeating words 
or phrases,” and “stuttering or unfinished 
words,” respectively. We attempted to re- 
group the various speech disruptions into 
categories which referred more uniformly to 
the expression of complete thoughts and which 
might be interpreted in terms of the function 
served by the disruption in the flow of speech. 
To these ends we adopted Dibner’s “break,” 
Mahl’s “correction,” their common “unfin- 
ished” or “incomplete,” and pooled their 
“repetition” and “stutter” categories. Mahl’s 
“tongue slip” and “omission” and Dibner’s
“omission” (part of his “stutter” category)
we combined into “distortion.” Dibner’s “I
don’t know,” “sigh,” “laugh,” and “ques- 
tion,” as well as Mahl’s “intruding incoherent 
sound,” were pooled in our “intrusion” cate- 
gory. The Mahl “Ah” category we expanded 
as “procrastination.” The following set of de-
scriptions specifies our categories of speech
disruption.


SPEECH DISRUPTIONS


B (break). The continuity of one line of 
thought is broken by the intrusion of another. 
The break is usually grammatically disruptive 
but it is the interruption of an incompleted 
line of thought by another, different thought 
that is distinctive. There must be an actual
shift of topic, so that the interrupted mate-
rial seems to have been displaced from the
speaker’s attention. Therefore a B is not
scored when the interjected material is a com-
mentary (even if rather anticipatory) upon
or a correction of the interrupted material.
Thrown-in statements, references to the in-
terviewer, and questions can be breaks.

C (correction). Something is stated and
then corrected within the same sentence. It
may be a word, phrase, or clause which the
correction is to replace, but in all cases the
subject of the corrected and correcting mate-
rial seems to be the same. Thus, the correc-
tion is usually one of pronunciation, empha-
sis, grammar, form, degree, specificity, or fact.
It must not modify nor explain but replace
the corrected material. The subject’s concep-
tion seems to be that he has said the wrong
thing, has misspoken himself, and must replace
his word or words. A clue to this is that both
corrected and correcting elements fit the con-
text.

F (fragment). A clause or sentence is left 
incomplete either in meaning or grammar. 
Whenever such a fragment occurs it is scored 
an F, even if it is apparently completely ex- 
pressed in another independent try. Thus, 
when a sentence is interrupted by a qualify- 
ing clause and then the interrupted portion is 
repeated or replaced, this portion is a frag- 
ment regardless of how similar are the origi- 
nal and resumptive materials. An intended 
clause interrupted and replaced by another is
an F, although the larger sentence into which
both the interrupted and interrupting clauses
fit is not an F. Answers to questions must be
judged as to whether they are adequately com-
plete responses, so they are not judged independently
of the question. When the material following
a possible F could be construed to complete
it, continuity or discontinuity of intonation is
a clue, for an F often involves a change be-
tween the fragment and what follows it.

D (distortion). Mistakes or distortions of
proper speech occur involving either grammar
or meaning. They may be unintended words,
including neologisms and tongue slips, incor-
rect words or grammatical errors which are
not habitual, or mispronunciations. These
distortions are recognized by their impro-
priety, prima facie or as inferred from the
context (often from the correcting material).
Word omissions are scored Ds where without
the ostensibly omitted word neither portion
of the statement (before or after the omis-
sion) would stand as a complete sentence or
thought.

R (repetition). Certain phrases, words, or
parts of [...] are perseverated upon by repe-
tition rather than prolongation. The flow of
speech is impeded by redundant elements such
as stutters, word or phrase repetitions, and
changes between contracted and [...]
forms. Repetitions share certain characteris-
tics with the repeated material which distin-
guish their perseverative quality. These
characteristics may involve pitch, [...], or
loudness. The paradigm for R is the [...].

P (procrastination). The speaker seems to
procrastinate, to delay getting on to his next
point. He does this by means of delaying
sounds (such as “ah,” “um,” or prolonged
vowels), words (such as “well”), or phrases
(“How shall I say?” sometimes even “I don’t 
know”). That he is controlling his speech and 
thinking ahead distinguishes procrastinations 
conceptually from repetitions. Many phrases 
which also serve other functions may partake 
of procrastination. These include “let’s say, 
let’s see, as a matter of fact, that is, for ex- 
ample, so to speak, I mean,” etc. “Well” and 
“Oh” are to be scored as procrastinations un-
less they clearly lack this quality, as when
they are purely exclamatory.

I (intrusion). A nonverbal sound intrudes 
upon the flow of speech, but it occurs mean- 
ingfully like a break, rather than as a mere 
supporting procrastinating or background 
noise. It may be a sigh, laugh, cough, throat 
clearing, deep breath. 

The reliability attained by two independent 
raters on the experimental material, rating 
from typescripts with the recording being 
played, was 88%. This represents exact agree- 
ment, i.e., the number of disruptions scored 
identically as to location and category di- 
vided by total disruptions scored. Contained 
in the denominator are the number of disrup-
tions scored by one and not the other rater,
scored differently by each, and scored identi- 
cally by each. Over our protocols this coeffi- 
cient ranged from a low of 71% to a high of 
93%, with an interquartile range of about five 
percentage points. One-fourth of the protocols 
were rated after a lapse of 1 month, but the 
interrater agreement did not diminish. In or-
der that response scores might be comparable
over individuals with different speech outputs, 
scores were defined as the number of instances 
of occurrence for a particular category di- 
vided by the total number of words in the 


response. 
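
As an illustration of the two computations just described, the following sketch works out the exact-agreement coefficient and the per-word disruption scores. It assumes each rater's scoring is stored as a mapping from a location in the typescript (here a word index) to a single category letter; that representation, the function names, and the example data are assumptions made for illustration and are not taken from the study.

```python
def exact_agreement(rater_a, rater_b):
    """Proportion of disruptions scored identically in location and category.

    The denominator is every location scored by either rater, so it
    covers disruptions scored by one rater only, scored differently by
    each, and scored identically by each.
    """
    locations = set(rater_a) | set(rater_b)
    if not locations:
        raise ValueError("no disruptions scored by either rater")
    identical = sum(
        1
        for loc in locations
        if loc in rater_a and loc in rater_b and rater_a[loc] == rater_b[loc]
    )
    return identical / len(locations)


def disruption_rates(category_counts, total_words):
    """Occurrences of each category divided by the words in the response."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return {cat: n / total_words for cat, n in category_counts.items()}


# Hypothetical scorings of one response (word index -> category letter).
rater_a = {3: "R", 10: "P", 25: "F", 40: "B"}
rater_b = {3: "R", 10: "I", 25: "F", 52: "C"}
print(exact_agreement(rater_a, rater_b))             # 2 identical of 5 scored
print(disruption_rates({"R": 2, "P": 5}, total_words=100))
```

Dividing by total words is what makes scores comparable across subjects who differ in speech output.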
EXPERIMENTAL DESIGN 


The crucial design problem in this study 
was to provide anxiety criteria against which 
to test our measures. We employed a two-part 
criterion for recognizing instances of anxiety 
and nonanxiety. First, we developed a set of 
descriptions of what seemed stressful and non- 
stressful situations (i.e., stimuli). Second, we 
questioned the subjects as to what their feel- 
ings were while they were describing their 
probable reactions in these situations. Those
verbal responses to the presumably stressful
situations which were reported to involve “a
major element of tension, [...] appre-
hension . . . fear or fright” or anxiety,
and no other emotions, were used as criterion
instances of anxiety; while the responses to
presumably nonstressful situations which were
reported to involve no such element or any
other emotion, were used as criterion instances
of nonanxiety. We set aside, as irrelevant for
validation purposes, the data on responses in
which a stressor invoked no report of a fear
or anxiety feeling, a nonstressor did invoke
such a report, or either invoked the report of
some other emotion. 
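
The two-part criterion can be restated as a small decision rule. The sketch below is only an illustration of the logic as described in the text; the three boolean inputs and the names `Criterion` and `classify_trial` are hypothetical conveniences, not part of the original procedure.

```python
from enum import Enum


class Criterion(Enum):
    ANXIETY = "anxiety"
    NONANXIETY = "nonanxiety"
    SET_ASIDE = "set aside"


def classify_trial(is_stressor, reported_anxiety, reported_other_emotion):
    """Apply the two-part criterion (situation type plus reported feeling)."""
    if reported_other_emotion:
        # Any report of some other emotion removes the trial from validation.
        return Criterion.SET_ASIDE
    if is_stressor and reported_anxiety:
        return Criterion.ANXIETY
    if not is_stressor and not reported_anxiety:
        return Criterion.NONANXIETY
    # Stressor without a fear or anxiety report, or nonstressor with one.
    return Criterion.SET_ASIDE


print(classify_trial(True, True, False))    # Criterion.ANXIETY
print(classify_trial(False, False, False))  # Criterion.NONANXIETY
print(classify_trial(True, False, False))   # Criterion.SET_ASIDE
```

Only trials where the situation and the report agree, with no other emotion reported, serve as criterion instances.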
These criteria were used because they rep-
resented, in our opinion, the best evidence
of anxiety that could be collected under
the circumstances. The argument upon which
this opinion rests is presented in detail else-
where (Krause, 1961b); also see Grinker,
Korchin, Basowitz, Hamburg, Sabshin, Persky,
Chevalier, and Board (1956). Briefly, certain
stimuli are on the face of them more stress-
ful than others. Our confidence that anxiety
is produced by them is greater than that it
is produced by apparently innocuous stimuli.
Our rationale for using the introspective re-
ports, the second part of our criterion, is that
one cannot feel afraid or anxious and yet not
be subject to this emotion, for the feelings are
sufficient evidence of the emotion’s presence.
It is possible, however, for one to be anxious
without feeling anxious; thus if a nonstressor
which induces no feeling of anxiety does pro-
duce speech disruption, the effects of predic-
tive invalidity and “unconscious anxiety” will
be confounded. The difficulty this involves
for us is that we may get considerable speech
disruption on trials classified as nonanxiety
trials. This would tend to diminish the differ-
ence in speech disruption between these con-
trol trials and the experimental (anxiety)
trials, thus decreasing the sensitivity of our
experiment.

The experiment was performed, ostensibly,
as a survey of how people react in disasters,
under the title Disaster Relief Research. The
subjects were recruited by means of a letter
aimed at arousing both their interest and
public spirit. It was sent to a systematic ran-
dom sample (Cochran, 1953, pp. 160-168) of
University of Michigan students drawn from
the summer directory. Only 19 of the 60 re- 
cipients volunteered to act as subjects, but we 
have no reason to believe our results biased by 
the subjects’ self-selection. 

Reinforcing the impression created by the 
letter, an introduction was given to each sub- 
ject before the experimental session. The study 
was presented as an attempt to discover what 
public reaction to disasters would be. We 
stressed the practical value of such knowl- 
edge, its scarcity, and the subject’s ability to 
provide it. The interviewer endeavored to 
make the session a collaboration in which the 
subject did his best to place himself in the 
situations described and spontaneously report 
his reactions, while the interviewer asked for 
elaborations or clarifications. This “cover” 
was essential to promote the subject’s involve- 
ment, without which the stressors would be 
ineffective, and his honesty in reporting feel- 
ings, without which we could not assert any 
correspondence between feelings and reported 
feelings. The control achieved by our cover, 
like any control upon which it is infeasible 
to collect independent data as to its efficacy, 
is open to question, of course, but we have no 
reason to suspect that the subjects did not 
believe and report as we intended. Their
involvement is evident in the recordings of their
sessions.

The greatest danger of subjects’ dishonesty
biasing our results is in their possible tendency
to report conventional, socially acceptable
feelings. We were especially concerned
about reports of anxiety feelings’ presence or
absence in response to what the subjects supposed
the interviewer would expect, rather
than actual feelings. The [...] danger,
discomfort, or [...] supposed to be involved.
This would tend to give us conservative results
in the predictability of anxiety by speech
disruption, i.e., any positive findings will tend
to be understated.

The existence of such a response-set is
questionable [...] reports of feelings (i.e., [...]
One collateral finding is interesting in this regard.
The unconventional responses of the 11 females in
our sample show a surprising regularity. Those who
scored over 12 on a 50-item sample of the Manifest
Anxiety scale err only in reporting anxiety to
nonstressors, while those scoring under 12 err only in
reporting nonanxiety to stressors. The degree of [...]

We selected 10 pairs of situations, a stressor
and a nonstressor in each. They all had some
relevance to disasters and ranged from a first
aid training proposal and Conelrad, through
frustrated attempts to help others, to situations
of great personal danger. They were
paired in an attempt to minimize the differences,
other than stress, between each stressor
and its nonstressor mate. Thus, their positions
in the series of 20 were contiguous and
the degree and manner in which the response
was structured by the situation and probe
questions were equated within pairs. A greater
similarity of within pair subject matter would
have been desirable as well, but for the greater
danger of overlap between the emotional effects
of stressor and nonstressor. The interpair
order was arranged so that the subjects
would not perceive a stressor-nonstressor pattern
and so that the apparently more intense
stressors did not have a carryover effect on
other stressors.

In summary: We exposed each of 19 subjects
to 10 pairs of stimuli presented in a
standard order. The subjects were under the
impression that they were participating in a
survey of how people might be expected to
feel and react during disasters. Those of their
pairs of responses which met our validity criteria
were studied in terms of a measure of
speech disruption for the relationship between
anxiety and this measure.


RESULTS

Thirty-seven pairs of responses over [...]
subjects (from a total of 190 pairs over 19
subjects) were appropriate for study. We were
relatively ineffective in controlling the stressfulness
of our stimuli, because we lacked [...]
knowledge of the subjects and used a standard
set of stimuli for all subjects. Some of the
work of Lazarus (1960) may prove helpful in
the future selection of stimuli.

[...] varies positively with MA scale score, r = 0.77
(significant at the .01 level).


TABLE 1

SPEECH DISRUPTION CATEGORIES AND PERCENT CORRECT,
INCORRECT, AND INDETERMINATE

                         Category
Prediction        I     F     R     P     C     D     B
Correct          78*   59    54    54    43    24     5
Incorrect        14    38    43    46    38    11     0
Indeterminate     8     3     3     3    19    65    95

* Significant at the .05 level, one-tailed.


The seven categories of speech disruption
were individually scored for each of the 37
response pairs. If the sign of the difference
(anxiety response score minus nonanxiety re- 
sponse score) was positive the category pre- 
dicted correctly for that pair. If the sign of 
the difference was negative, it predicted incor- 
rectly. And if there was no difference, it failed 
to discriminate. The differences for each cate- 
gory, alone and in combination with others, 
were tested for significance by the normal ap- 
proximation to the binomial for the differ- 
ence between two percentages. The results in 
Table 1 are the percentages obtained for each 
of seven categories viewed separately. 
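
The sign-based scoring just described can be sketched in a few lines. This is only an illustration under stated assumptions: the test is taken to be the ordinary normal approximation to the binomial with ties excluded and no continuity correction, and the counts (29 correct, 5 incorrect, 3 ties for I; 22 and 14 for F) are inferred here from the rounded percentages of 37 pairs, not taken from the authors' raw data. The function name `sign_test_z` is hypothetical.

```python
import math

def sign_test_z(n_correct, n_incorrect):
    """Normal approximation to the binomial sign test (ties excluded).

    Under the null hypothesis a category is as likely to predict
    correctly as incorrectly, so z = (correct - incorrect) / sqrt(n).
    Returns z and the one-tailed probability.
    """
    n = n_correct + n_incorrect
    z = (n_correct - n_incorrect) / math.sqrt(n)
    p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_one_tailed

# Counts inferred from the Table 1 percentages (an assumption).
for label, correct, incorrect in (("I", 29, 5), ("F", 22, 14)):
    z, p = sign_test_z(correct, incorrect)
    print(f"{label}: z = {z:.2f}, one-tailed p = {p:.4f}")
```

Under these assumptions I falls well beyond the .05 criterion and F does not, which is consistent with the single asterisk in Table 1; the exact procedure the authors used may differ.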

By combining categories we can obtain a 
slightly higher percentage of correct predic- 
tion than afforded by any individual category 
alone. The best simple combination seems to 
be to use I, and where it fails to discriminate 
use F. This combination predicted correctly 
in 86% of the cases and incorrectly in 14% 
of the cases. 
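
The fallback rule for combining I and F can be written out as follows. This is a minimal sketch; the function name `classify_pair` and the representation of each category score as a signed difference (anxiety response minus nonanxiety response) are assumptions made for illustration.

```python
def classify_pair(i_diff, f_diff):
    """Use I; where I fails to discriminate (difference of zero), use F.

    Each argument is the anxiety-response score minus the
    nonanxiety-response score for one response pair.
    """
    for diff in (i_diff, f_diff):
        if diff > 0:
            return "correct"
        if diff < 0:
            return "incorrect"
    return "indeterminate"

# Hypothetical pairs: I decides the first, F decides the second,
# and neither category discriminates in the third.
print(classify_pair(2, -1))
print(classify_pair(0, -1))
print(classify_pair(0, 0))
```

Because F is consulted only when I shows no difference, the combination can never reverse a prediction made by I alone; it can only convert indeterminate pairs into correct or incorrect ones.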

Both Mahl and Dibner have used the sum 
of several categories for a speech disruption 
score. We can approximate their scores by 
combinations of our category scores. The 
Mahl non-Ah ratio (Mahl, 1956) is akin to 
the sum of all our categories excluding I and 
P, while Dibner’s Cue-Count 1 (Dibner, 
1956) resembles our sum minus only I. 

Despite the intended similarity between the 
Mahl non-Ah ratio and our sum of categories 
excluding I and P, it should be pointed out 
that our approximation of this ratio discrimi- 
nated correctly in only 57% of the cases. A 
somewhat higher figure was reported by Mahl
and Kasl (1958), in a similar experiment, for
the non-Ah ratio, suggesting the possibility
that the scoring criteria were not identical.


TABLE 2

SOME COMBINATIONS OF SPEECH DISRUPTION CATEGORIES
AND PERCENT CORRECT, INCORRECT, AND INDETERMINATE

                        Combination
Prediction       Total    Non-I    Non-I & Non-P
Correct           70*      62          57
Incorrect         30       38          40
Indeterminate      0        0           3

* Significant at the .05 level, one-tailed.


It does not seem that the extra effort re- 
quired to score the categories other than I 
(with the possible exception of F) is war- 
ranted by their incremental effect upon va- 
lidity. It is interesting, however, that speech 
disruption categories as a whole are, when 
pooled, statistically significant indicators of 
anxiety. This suggests that speech disruption 
may be a useful (unitary) concept. 

In order to give the reader some basis for 
comparing the validities of the disruption 
measures with those of other verbal anxiety 
measures, we include the results on three 
other suggested measures for which it was fea- 
sible to score our data. These are number 
of words spoken in the subject’s response 
(Balkan & Masserman, 1940), the verb-adjec- 
tive ratio (Balkan & Masserman, 1940), and 
the latency of the subject’s response (Benton, 
Hartman, & Sarason, 1955). The results, por- 
trayed in Table 3, emphasize the effectiveness 
of I (with or without F) as an indicator of
anxiety.


TABLE 3

NONSPEECH DISRUPTION CATEGORIES AND PERCENT
CORRECT, INCORRECT, AND INDETERMINATE

                           Category
                 Number of     Verbs/
Prediction         Words      Adjectives    Latency
Correct             59           65           62
Incorrect           41           35           24
Indeterminate        0            0           14


Merton S. Krause 


and Marc Pilisuk 


DISCUSSION 


The sorts of explanations alternative to 
anxiety which may be offered to account for 
speech disruption depend upon the situation 
in which the subject is placed. If we ask him 
to describe vague, ambiguous, or half forgot- 
ten experiences, then procrastinations, correc- 
tions, and fragmented sentences may well be 
frequent. If the interviewer is rather reactively
mobile in his facial expression, this may in- 
duce a good deal of repetition or correction 
by the subject. An overly friendly or humor- 
ous interviewer can multiply intrusions of 
laughter, while one who nods his understand- 
ing anticipatorily may encourage distortions 
and fragmentation. If the subject does not
perceive the situation as one calling for par- 
ticularly coherent, grammatical, task-oriented, 
and economical speech—as he might not if he 
did not consider his role a serious one or if 
the interviewer were an old friend—he may 
not try to achieve a high level of disruption- 
free speech. We attempted to avoid these 
sources of speech disruption by our choice of 
stimuli, instruction of the interviewer, and 
experimental cover. Furthermore, these influences
tend to affect the absolute level of
speech disruption and not its variations over
the several stimuli presented.

There are, however, peculiarities of our ex- 
perimental situation which may limit the gen- 
eralizability of our major results. The subjects 
were asked, in effect, to play-act. There were 
no “real” dangers in the situation perceptible 
by the subjects but for becoming excessively 
distraught by playing too well or offending 
the interviewer by getting too involved in 
the play and thereby “revealing” themselves.
These dangers might be relieved by maintaining
or shifting into a role-alien (e.g., an observing
ego) viewpoint and so achieving some
perspective or “emotional distance” from what
one was saying or thinking. The predominant
component of I, laughter, was, as we define
it, peculiarly well fitted as an indicator of
such a stratagem. Where the subject became
threatened, he may have laughed as an indication
of his lack of commitment to the ongoing
remarks, for one can most certainly discern
this subtle intrusive quality in much of
the subjects’ laughter. This is not to say that
the subjects were not anxious then, but it does
suggest that in a more thoroughly real or
spontaneous situation (e.g., using sudden intense
stimuli as stressors) intrusive laughter
might be more rare, as Bs were in this situation,
and so fail as a predictor of anxiety.


SUMMARY 


In order to assess the predictive validity of
speech disruption as an indicator of transitory
anxiety, we exposed 19 subjects to 10
stressors and 10 nonstressors. The combination
of stressor and reported feelings of anxiety
was the criterion of anxiety’s presence
and that of nonstressor and reported absence
of anxiety feelings was the criterion of anxiety’s
absence. We found that intrusive nonverbal
sounds, mainly laughs and sighs, were
the most correct predictors.

CHARACTERISTICS OF TERMINATORS AND REMAINERS
IN CHILD GUIDANCE TREATMENT


ALAN O. ROSS AND HARVEY M. LACEY
Pittsburgh Child Guidance Center 


Caseload statistics at Pittsburgh Child Guid- 
ance Center indicate that 28% of all families 
accepted for treatment terminate their con- 
tact unilaterally and at a time which the 
clinic staff considers premature. This experi- 
ence seems similar to that of other clinics, for 
Levitt (1958) states that some report a drop- 
out rate of more than 30%. Premature termi- 
nation is costly because as many as 15 hours 
of scarce professional time may have been 
spent in diagnostic study and related activi- 
ties before the family decides to discontinue 
the contact. This poses a practical problem. 
At the same time these families are of theo- 
retical interest because, though carefully se- 
lected, they are unable to make use of col- 
laborative treatment, the traditional approach 
of child guidance clinics. 

Some families seem to derive benefit even 
from a truncated contact (Inman, 1956) so 
that the professional time may not be wasted 
completely, but if potential terminators could 
be identified early in the proceedings, they 
might be provided with a modified service, 
more suited to their needs. The staff time thus 
freed could then be made available to other 
families better able to use the traditional child
guidance approach.

This study was designed to explore vari- 
ables which might help differentiate between 
terminators and remainers. A number of in- 
vestigators have attempted to identify charac- 
teristics of families who remain in treatment 
and of those who terminate prematurely. 
Levitt (1958) found that judgments of clinicians,
made on the basis of diagnostic information,
did not identify “defectors” successfully;
nor did judgments of motivation
for treatment and severity of symptoms
differentiate between the two groups.

Hofstein (1957) has pointed out that in a 
child guidance clinic, assessment of a child’s 
treatability must rest on the evaluation of 
the parents’ capacity to involve themselves in 
the treatment process and to work toward a 
change in their relations to each other as well 
as to their child. The important contribution 
of parental attitudes to continuance in child 
guidance treatment has also been stressed by 
Inman (1956) and Smigelsky (1949). 

Lake and Levinger (1960) found differences 
in parental attitudes when they compared 50 
continuers with 50 discontinuers. The parents 
of continuers tended to be more aware of the 
child’s disturbance and of their own con- 
tribution to it. They were more inclined to 
see the problem as something for which the 
family as a whole was responsible and ac- 
cepted that they themselves had to participate 
in finding a solution. They also displayed 
greater cooperation during interviews and
tended to agree with the worker on the na- 
ture of the child’s disturbance. 

Levitt (1957) compared defectors with remainers
on 61 variables and found 5 which
seemed to differentiate between the two
groups. The variables did not form a meaningful
cluster, nor did there appear to be
“any theoretical reason to expect them to be
differentiating.” In addition, the probability
of finding 5 analyses out of 61 significant at
the .05 level is very nearly .25. These considerations
led Levitt to ascribe his results to
chance. The present study attempted to anticipate
this dilemma by testing a priori predictions
made on the basis of logical and
theoretical considerations.

METHOD


This center prepares an IBM punch-card record at
the time each case is closed. The record is coded on
67 categories, ranging from objective background
descriptions of the child and the family, to items dealing
with developmental and health history, presenting
symptoms, diagnosis, and clinic procedures.¹ The
system includes every case closed since 1943 and, at
the time of this investigation, contained 2,400 records.
A study by Gilbert (1957) suggests that girls referred
to child guidance clinics do not come from the same
population as boys. Therefore, only boys’ records
were used for the present analysis. There were 1,497
boys among the cases. Terminators and remainers
were drawn from this population.

Terminators were defined as families who had entered
treatment but discontinued therapy on their
own decision before five treatment interviews with
the child had taken place. A total of 107 cases met
these criteria.

Remainers were defined as families who had entered
treatment, continued for more than 16 treatment
interviews with the child, and terminated either
by mutual or clinic decision. A total of 154 cases met
these criteria.

Of the 67 categories of information available on
each case, 27 were chosen for analysis. Those categories
were selected in which the information was
[...] enough for research purposes and relevant to
[...] of parental attitudes toward the clinic or
toward the child. Specific predictions of the relations
between the variables and the continuance dichotomy
were made and recorded before analysis of the data.
For some of the categories more than 1 prediction was
made so that a total of 32 discrete predictions was
put to test.

Some predictions were based on results of studies
reported in the literature (Affleck & Mednick, 1959;
Lake & Levinger, 1960; Levitt, 1957; Rubenstein &
Levitt, 1957; Tuckman & Lavell, 1959). Original
predictions were based on parents’ ability to participate
in treatment, which seemed relevant on the basis of
clinic experience.

1 This system was instituted by William F. Finzer,
Director of the center, and is maintained by Sallie
[...]. To them, and to David S. Lepson, who
assisted with the analysis, the authors express their
appreciation.


RESULTS


Of the 32 predictions, 9 could not be tested
because insufficient information precluded a
meaningful analysis. These hypotheses related
to infantile behavior, tomboy or effeminate
behavior, father’s employment status, mother’s
employment status, nature of family residence,
and three others related to diagnostic
categories. Five of the 32 predictions were
confirmed at better than the .02 level of
significance. When 32 comparisons are possible,
the probability of obtaining five differences, 
significant below the .02 level, by chance is 
less than .0001 (CR = 5.51). This calcula- 
tion is based on the approximation described 
by Brožek and Tiede (1952).
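
The reported critical ratio can be reproduced with a short calculation. The sketch below assumes that the approximation amounts to the usual normal approximation to the binomial without a continuity correction; that reading has not been checked against Brožek and Tiede, and the function name `critical_ratio` is hypothetical. Applied to 5 significant results among 32 tests at the .02 level it gives a CR of about 5.51. Applied to Levitt's 5 of 61 at the .05 level it gives a CR of about 1.15, whose two-tailed probability is close to the .25 quoted earlier; whether Levitt arrived at his figure this way is likewise an assumption.

```python
import math

def critical_ratio(k_significant, n_tests, alpha):
    """Observed count of significant tests compared with chance.

    Treats the number of tests significant at level alpha as binomial
    with mean n*alpha and variance n*alpha*(1 - alpha).
    """
    expected = n_tests * alpha
    sd = math.sqrt(n_tests * alpha * (1 - alpha))
    return (k_significant - expected) / sd

# Present study: 5 of 32 predictions confirmed beyond the .02 level.
cr_present = critical_ratio(5, 32, 0.02)

# Levitt (1957): 5 of 61 variables significant at the .05 level.
cr_levitt = critical_ratio(5, 61, 0.05)
two_tailed_levitt = math.erfc(cr_levitt / math.sqrt(2))

print(f"present study CR = {cr_present:.2f}")
print(f"Levitt CR = {cr_levitt:.2f}, two-tailed p = {two_tailed_levitt:.2f}")
```

The calculation treats the 32 tests as independent, which is only approximately true when several predictions are drawn from the same case records.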

For the remaining 18 predictions the re- 
sults failed to reach a satisfactory level of sta- 
tistical significance, although the differences 
were in the predicted direction in all but five 
of the comparisons. 


Confirmed Predictions 


Compared to terminators, remainers have a
greater proportion of histories of developmental
difficulties (complications in weaning,
toilet training, delayed speech, reduced social
responsiveness) (χ² = 10.12, p < .01).

The classification “Unusual Behavior” (confusion,
disorientation, panic reactions, unpredictable,
meaningless, and self-destructive
acts) was found more frequently among the
remainers than among the terminators
(χ² = 11.49, p < .001).

There was a higher incidence of marital
disharmony (excluding divorce and separation)
among the remainers than among the
terminators (χ² = 5.96, p < .02).

Compared to terminators, remainers contained
more cases where, in addition to the
child’s individual therapy, both parents were
seen individually for treatment. Among the
terminators only one parent (usually the
mother) tended to be in concurrent treatment
(χ² = 15.25, p < .001).

Termination was positively related to obtaining
clinic service immediately after application,
without having to wait for the intake
interview; remainers tended to be families
who had to wait for service (χ² = 29.16,
p < .001).
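
For readers who wish to check the probability levels attached to these chi-square values, the tail probability for one degree of freedom can be obtained from the complementary error function. The sketch below is an illustration only; the function name `chi2_p_df1` is hypothetical, and the five values are those reported above for the confirmed predictions.

```python
import math

def chi2_p_df1(chi2):
    """Upper-tail probability of chi square with 1 degree of freedom.

    With df = 1, chi square is the square of a standard normal
    deviate, so P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2))

reported = {
    "developmental difficulties": 10.12,
    "unusual behavior": 11.49,
    "marital disharmony": 5.96,
    "both parents in treatment": 15.25,
    "waiting for service": 29.16,
}
for label, value in reported.items():
    print(f"{label}: chi2 = {value}, p = {chi2_p_df1(value):.5f}")
```

Each computed probability falls below the level stated in the text, which supports reading the garbled test statistics as chi squares with one degree of freedom.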

In addition to the results which confirmed 
specific predictions, two findings for which no 
hypotheses had been formulated emerged as 
the data were analyzed. Not having been spe- 
cifically predicted, however, they are some- 
what less convincing than the results cited 
above. 

Compared to terminators, remainers have
a higher incidence of specific somatic disorders
(asthma, eczema, stuttering) as opposed
to nonspecific somatic disorders (undiagnosed
pains, sweating, tension)
(χ² = 10.19, p < .01).

When truancy was compared with other
educational problems (reading and other
learning difficulties, school phobia, school behavior
problems) it was found more often
among terminators than among remainers
(χ² = 11.04, p < .001).

2 df = 1 unless otherwise indicated.


Unconfirmed Predictions 


The following relations, although not statistically
significant, were in the predicted direction:
The older the child, the greater the
tendency to remain (t = .25). Families previously
known to other social agencies tend to
remain (χ² = .63). Families known to juvenile
court tend to terminate (χ² = 1.27).
Children with runaway problems tend to be
among the terminators (χ² = 2.08). Antisocial
behavior is high among the terminators
(χ² = .07). The more advanced the mother’s
education, the greater the tendency to remain
(χ² = 2.68). The lower the child’s intelligence,
the greater the tendency to terminate
(χ² = .15). The middle income group, as opposed
to the low income and high income
groups, tends to remain (χ² = 3.79).

The following comparisons were found to
be in the opposite direction of predictions,
but none reached a statistically significant
level. It had been predicted that Negro families
would tend to terminate (χ² = .99); that
fee paying cases would remain (χ² = .005);
that children with all types of somatic disorders
would be among the remainers
(χ² = .19); that children with all kinds of
educational problems would tend to remain
(χ² = .95); and that the more advanced the
father’s education the greater the tendency to
remain (χ² = .04).

DISCUSSION 


Studies of abrupt termination in adult psychotherapy
have shown that a patient’s motivation
is one factor which differentiates the
terminator from the remainer (Affleck & Mednick,
1959). In child guidance therapy it is
the parents’ motivation which is the determining
factor, but the parents must not only
have the motivation to bring the child and
support his treatment experience, but they
must also have the capacity to involve them- 
selves actively in the therapy program. Pre- 
mature termination tends to occur when 
either of these important conditions is not 
met. 

Length of time on the waiting list between 
application and intake seems to be one meas- 
ure of parental motivation. The parent with 
low motivation would not be expected to re- 
main in clinic contact if he had to wait for 
any length of time. When treatment is finally 
started the group still on the waiting list may 
be assumed to contain a high proportion of 
well motivated individuals. Thus, there is a
significantly greater proportion of waiting list
cases among the remainers than among the
terminators. There was a similar relation, although
below the adopted level of significance,
when the groups were compared on waiting
from intake to diagnostic study (χ² = 1.37,
p > .20) and on waiting from diagnostic
study to beginning of treatment (χ² = 1.79,
p > .10). While a waiting list is an undesirable
feature of a clinic’s operation, delay
does, in many instances, serve as a screening 
device which eliminates poorly motivated pa- 
tients before they get to the point of treat- 
ment. 

The comparison which bears most directly
on the parents’ ability to involve themselves
in treatment is that between families where
only one parent was in concurrent treatment
and those where both parents were being seen
individually. When only one parent is in concurrent
treatment, it is almost always the
mother. The results clearly show that when
the father, as well as the mother, can be involved
in the treatment plan the chances that
the case will terminate prematurely are greatly
reduced.

It has become more universally recognized
that the father plays an important and often
crucial role in the treatment of children
(Rubenstein & Levitt, 1957). The findings
here reported lend strong support to this
trend. They suggest that when the father is
not in treatment, he may either [...]
sabotage treatment efforts or, by [...]
treatment experiences, materially [...]
the process. This may be especially true in
the treatment of boys, the subjects in the
present study.

Parental motivation might be expected to
be a function of the distress which the child’s
symptoms cause the parents. Hiler (1959),
studying adults in individual psychotherapy,
found that patients who complain only of
somatic symptoms are likely to terminate,
probably because the symptom enables them
to “bind” their anxiety. In child guidance
therapy, the reverse seems to be true. The
bizarre behavior of the mentally ill child and
such specific somatic disorders as asthma may
be assumed to be distressing to the parent,
who would be very much aware of the disturbance
and motivated to obtain help. The
results of relevant comparisons strongly suggest
that the more apparent the symptom,
the more likely the case is to continue in
treatment.

Another indirect reflection of parental involvement
and motivation may be the fact
that remaining was positively related to histories
of developmental difficulties. A child
with such a history may be assumed to have
a psychological problem of long standing so
that his parents might be more aware of it,
more distressed by it, and thus more highly
motivated to remain in treatment. In addition,
the mother’s report of difficulties in her
child’s early years may represent her awareness
of her own contribution to the problem,
an awareness Lake and Levinger (1960) report
as a factor in continuance.

The confirmed prediction that remainers
would contain proportionately more cases with
marital disharmony also bears on a parent’s
personal involvement. The finding reflects the
parents’ ability to talk about a problem which
[...] them. In addition, marital disharmony
is a distressing personal problem for
the parents and one for which they might hope to
find some relief through the clinic contact.

On the basis of studies reported from adult
outpatient clinics (Hollingshead & Redlich,
[...]; Rubenstein & Lorr, 1956), it had been
predicted that certain sociocultural factors
[...] p < .10), no difference between remainers
and terminators in fathers’ occupational class
was found (χ² = 3.72, df = 5, p < .70).
Tuckman and Lavell (1959), who compared 
social status at all stages of clinic contact, 
also found that higher status patients were no 
more likely to maintain contact with the clinic 
than lower status patients. This would sug- 
gest that child guidance clinics are more suc- 
cessful than adult outpatient clinics in help- 
ing patients from the lower socioeconomic 
levels to remain in treatment.

It has been reported that a higher propor- 
tion of the continuers perceive the child’s 
problem themselves, rather than through the
demands made by the community, and that 
they desire change in themselves, as well as 
in their child and in their spouse (Lake & 
Levinger, 1960). The present results are con- 
sistent with these findings. Terminators were 
more often referred by an authority, such as 
juvenile court or school, while remainers were 
more often referred by friends, social agencies,
or themselves (χ² = 3.77, p < .10).

The incidence of truancy is significantly 
higher among terminators than among re- 
mainers when compared with other school 
problems. This result had not been specifically 
predicted. It seems that truancy is an expres- 
sion of pathological family behavior patterns 
where avoidance of anxiety-arousing situations 
takes the form of physical departure. A family 
which abruptly and unilaterally terminates 
clinic contact and a child who truants from 
school may be manifesting the same basic re- 
action to stress. 

While we were able to confirm a significant 
number of a priori predictions, this investiga- 
tion shares the weakness of all ex post facto 
research on closed cases. A specific fact only 
becomes a research datum if the patient re- 
ports it accurately, the interviewer records it 
fully, and the person responsible for coding 
codes it correctly. It is hoped that a projected 
long range study will not only overcome these 
difficulties but also address itself to the ques- 
tion of treatment outcome and to the prob- 
lem of parental attitudes toward the child 
which Lake and Levinger (1960), among 
others, have suspected to be an important 
variable in continuance of treatment contact.

SUMMARY


Families who terminated child guidance 
contact before the fifth treatment session 
were compared with families who continued 
treatment for a minimum of 16 interviews. 
Remainers had significantly more develop- 
mental difficulties, unusual behavior, marital 
disharmony, and specific somatic disorders. 
They contained significantly more cases where 
both parents were in concurrent treatment. 
Terminators had significantly more school 
truancy, and they had less often experienced 
a waiting period between application and in- 
take. The results were discussed in terms of 
the importance of the parents’ motivation and 
their ability to involve themselves in the treatment
process.

CHRONICITY OF NEUROPSYCHIATRIC HOSPITALIZATION:
A PREDICTIVE SCALE 


JAMES M. ANKER 


Veterans Administration Hospital, Perry Point, Maryland 


This paper reports an attempt to predict
the length of time that newly admitted patients
will stay confined in a neuropsychiatric
hospital. Specifically, the primary concern is
with the prediction of the relatively infrequent
event of chronicity. Although most mental patients
admitted are discharged as improved
within a reasonable period of time, long-term
neuropsychiatric chronicity has become a major
problem in the field of mental health
(Giedt & Schlosser, 1955; National Committee
against Mental Illness, Inc., 1957). It is
important to note that the probability of discharge
decreases markedly with the passage
of time so that, over the years, hospitals are
accumulating populations which are predominantly
chronic. “In the average state mental
hospital, about 15% of the patients have been
there less than a year; about 25% have been
there between 1 and 5 years; about 60% have
been there from 5 to 45 years or longer” (National
Committee against Mental Illness, Inc.,
1957). Despite the fact that durations of hospitalization
have decreased somewhat in recent
years (Harris & Norris, 1954; Kramer &
[...], 1957) the problem of identification
and special treatment of potential chronics
early in the course of their hospitalization remains
a most important and intriguing challenge.

Predictive studies dealing with the course
of the mental patient, e.g., regarding psychotherapy,
outcome of hospitalization, and likelihood
of rehospitalization, generally have produced
results of low predictive value (Barron,
1953a, 1953b; Bayard & Pascal, 1954;
Briggs, 1958; Cole, Swensen, & Pascal, 1954;
[...] Zubin, Mettler, & Pogan, [...];
[...] & Meltzer, 1946; Feldman, [...],
1954; Gallagher, 1954; Gildea [...],
1942-43; Orr, Anderson, Martin, &
Philpot, 1954-55; Peterson, 1954a, 1954b; 
Schofield, Hathaway, Hastings, & Bell, 1954; 
Swensen & Pascal, 1954a, 1954b). Two stud- 
ies offer more promising results. Meeker’s 
(1958) exploratory MMPI scale to predict 
length of neuropsychiatric hospitalization ap- 
pears to have greater value but his limited 
samples somewhat obscure the results. One 
study using demographic data has been able 
to predict length of neuropsychiatric hospital 
stay (more than or less than 90 days) with 
77.2% overall accuracy (Lindemann, Fair- 
weather, Stone, Smith, & London, 1959). A 
discussion of common shortcomings and diffi- 
culties in prognostic research may be found 
in Zubin and Windle (1954). They also point 
out the unusual prevalence of contradictory 
and low-order results. 


PROBLEM 


Two basic problems presented themselves 
for study. Improving the means of identifying 
potentially chronic patients among those newly 
admitted appeared necessary. Such identifica- 
tion could result in new, or at least more in- 
tensive, treatment procedures for this group 
and, hopefully, shorten their stay. Secondly, 
it seemed advantageous to develop an identifi- 
cation procedure which would be meaningful 
in relation to personality characteristics of the 
patient. Rather than using demographic data, 
for instance, this information should be more 
capable of suggesting specific types of treat- 
ment procedures aimed at reducing chronicity. 
For these reasons a personality inventory was 
chosen to provide the basic data for the con- 
struction of a predictive scale. The MMPI 
was selected because data were available. This 
paper reports the development of the scale. 
ORIGINAL SAMPLE 


Procedure. MMPI protocols at least 1 year old 
were taken from the records of the psychology serv- 
ice of a large Veterans Administration hospital. Those 
protocols that had been obtained over 2 months after 
the date of admission were not used as data. Further, 
because of their small number, the records of female 
patients were excluded. Two criterion groups were 
selected from these protocols; a short-term group 
which stayed 6 months or less and a long-term group 
which stayed 1 year or longer. The short-term group 
numbered 103 and the long-term group 63. The data 
from these dichotomous groups were item analyzed 
for every item in the MMPI item pool by chi square 
from a 2 X 2 contingency table. These computations 
were verified graphically by solving the chi square 
quadratic and generating an elliptical ABAC in the 
manner suggested by Andrews (1952). 
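
The item analysis described above can be sketched for a single item as follows. This is a minimal illustration only: the cell counts are hypothetical (the source does not report per-item frequencies), the function name is invented here, and the source does not say whether a continuity correction was applied, so none is used.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi square (1 df, no continuity correction) for the table
           answered True   answered False
    long        a               b
    short       c               d
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a marginal total is zero; chi square is undefined")
    return n * (a * d - b * c) ** 2 / denom


# Standard critical values of chi square with 1 degree of freedom.
CRITICAL_05 = 3.841
CRITICAL_01 = 6.635

# Hypothetical counts for one item, using the group sizes given in the text
# (63 long-term and 103 short-term patients). The split within each group
# is made up for illustration.
long_true, long_false = 30, 33
short_true, short_false = 25, 78

chi2 = chi_square_2x2(long_true, long_false, short_true, short_false)
print(f"chi square = {chi2:.2f}")
print("differentiates at .05:", chi2 > CRITICAL_05)
print("differentiates at .01:", chi2 > CRITICAL_01)
```

Repeating this for every item in the pool and counting the items that exceed each critical value gives the kind of tally reported in the Results.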


Results. Fifty-five different items differenti- 
ated the criterion groups at the .05 level or 
less, 33 items at the .02 level or less, and 17 
items at the .01 level or less. These frequencies 
exceeded those for a series of statistical tests 
at less than the .01 level (Block, 1960; 
Sakoda, Cohen, & Beall, 1954). The point 
biserial correlation between scale score (each 


TABLE 1

COMBINED SAMPLE CHARACTERISTICS

Diagnosis
  Schizophrenic reactions                        204
  Affective reactions
  Psychoneurotic reactions                        72
  Paranoid state                                   2
  Psychophysiologic reactions                      4
  Acute brain syndromes                            8
  Chronic brain syndromes                         17
  Personality trait disturbances                  13
  Personality pattern disturbances                 6
  Sociopathic personality disturbances            16
  Transient situational personality disorders      4
  Unknown                                          5
  Total                                          358
Race
  Caucasian                                      309
  Negro                                           41
  Oriental                                         2
  Unknown                                          6
  Total                                          358
Age
  Mean                                          33.8
  Standard deviation                             9.1

Note.—The total sample of 433 was reduced to 358 through
elimination of short form and otherwise incomplete protocols.
item arbitrarily given a possible score of 1) 
and the dichotomous criterion of chronicity 
was .53 for all 55 items. The point biserial 
correlation between scale score on the items 
differentiating at the .02 level or less and the 
dichotomous criterion of chronicity was .63. 
These values are only approximations, how- 
ever, because of the discontinuity between the 
two samples. These original results were sub- 
mitted to cross-validation procedures. 
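
For readers unfamiliar with the statistic, a point biserial correlation between a scale score and a two-group criterion can be computed as sketched below. The data are a made-up toy example, not the study's scores, and the function name is an assumption of this sketch.

```python
from math import sqrt


def point_biserial(scores, groups):
    """Point biserial r between numeric scores and a 0/1 group code.

    Uses r = (M1 - M0) / SD * sqrt(p * q), where SD is the standard
    deviation of all scores (dividing by n) and p, q are the group
    proportions. This equals the Pearson r between scores and the code.
    """
    n = len(scores)
    ones = [s for s, g in zip(scores, groups) if g == 1]
    zeros = [s for s, g in zip(scores, groups) if g == 0]
    if not ones or not zeros:
        raise ValueError("both groups must contain at least one case")
    mean_all = sum(scores) / n
    sd = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    if sd == 0:
        raise ValueError("scores have no variance")
    p = len(ones) / n
    q = 1 - p
    return (sum(ones) / len(ones) - sum(zeros) / len(zeros)) / sd * sqrt(p * q)


# Toy data: 1 = long-term, 0 = short-term (hypothetical values).
toy_scores = [1, 2, 3, 4]
toy_groups = [0, 0, 1, 1]
r_pb = point_biserial(toy_scores, toy_groups)
print(f"point biserial r = {r_pb:.3f}")
```

As the text notes, the criterion groups here leave out the 6 to 12 month cases, so a coefficient obtained this way only approximates the relationship in an unselected sample.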


CROSS-VALIDATION SAMPLE 


Procedure. MMPI protocols over 1 year old which 
were obtained within 2 months of admission were 
solicited from a number of Veterans Administration 
neuropsychiatric hospitals.1 Protocols were sorted 
into the same criterion groups as in the original sam- 
ple. The group with durations of more than 6 and 
less than 12 months was held for later analysis. These 
data were analyzed for those 55 items which dif- 
ferentiated the criterion groups in the original sam- 
ple. The short-term group in the cross-validation 
sample numbered 144 and the long-term group 123. 
The diagnostic, racial, and age characteristics of the 
original and cross-validation sample combined are 
presented in Table 1. 


Results. The cross-validating item analysis 
produced 21 items which differentiated be- 
tween the criterion groups at approximately 
the .05 level or less, 11 at the .02 level or less 
and 9 at the .01 level or less. These items 
and their combined probabilities (Lindquist, 
1940) are presented in Table 2.2 
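
The source cites Lindquist (1940) for the pooling of probabilities without describing the procedure. One common way to pool probabilities from independent samples is Fisher's method, sketched here as a stand-in; whether it matches the procedure actually used is an assumption, and the input values are illustrative only.

```python
from math import exp, log


def fisher_combined_p(p_values):
    """Pool independent p values with Fisher's method.

    X = -2 * sum(ln p) is referred to chi square with 2k degrees of
    freedom. For even degrees of freedom the upper tail has the closed
    form exp(-X/2) * sum_{i<k} (X/2)**i / i!, so no tables are needed.
    """
    if not p_values or any(not (0 < p <= 1) for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    k = len(p_values)
    half_x = -sum(log(p) for p in p_values)
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return exp(-half_x) * total


# Illustrative: an item reaching the .05 level in each of two samples.
pooled = fisher_combined_p([0.05, 0.05])
print(f"pooled probability = {pooled:.4f}")
```

Two results that are each only marginal pool to a noticeably smaller probability, which is consistent with the pooled values in Table 2 running lower than the .05 level used for selection.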


SCALE CONSTRUCTION 


The 21 items thus selected comprised the 
basic scale. Data from the original and the 
cross-validation samples were pooled and items 
were weighted according to their discrimina- 
tory ability (Guilford, 1942). Only one item, 
number 35 in the MMPI booklet, warranted a 
weighted score of 2. The other 20 items in the 
scale were weighted 1. 

Protocols from the group that stayed be- 
tween 6 and 12 months were combined with 


1 The author wishes to express his gratitude to 
William C. Hallow, Veterans Administration Hospital, 
Lebanon, Pennsylvania; Burke Smith, Veterans 
Administration Hospital, Roanoke, Virginia; and John 
B. Marks, Veterans Administration Hospital, American 
Lake, Washington, for their cooperation in supplying 
the cross-validation data for this study. 

2 The list of the 55 items which discriminated the 
criterion groups in the original sample is available 
from the author on request. 
TABLE 2

CROSS-VALIDATED SCALE ITEMS AND THEIR POOLED PROBABILITIES

Number
in MMPI                                                                            Pooled
Booklet      Item                                                                  Value
16           I am sure I get a raw deal from life. (T)                             .0005
20           My sex life is satisfactory. (F)                                      .0005
35           If people had not had it in for me I would have been much more
               successful. (T)                                                     .0005
42           My family does not like the work I have chosen (or the work I
               intend to choose for my life work). (T)                             .005
[illegible]  I am liked by most people who know me. (F)                            .001
60           I do not read every editorial in the newspaper every day. (F)         .0005
162          I resent having anyone take me in so cleverly that I have had to
               admit that it was one on me. (T)                                    .025
184          I commonly hear voices without knowing where they come from. (T)      .0005
252          No one cares much what happens to you. (T)                            .0005
262          It does not bother me that I am not better looking. (F)               .005
265          It is safer to trust nobody. (T)                                      .025
278          I have often felt that strangers were looking at me critically. (T)   .01
309          I seem to make friends about as quickly as others do. (F)             .005
324          I have never been in love with anyone. (T)                            .005
354          I am afraid of using a knife or anything very sharp or pointed. (T)   .005
427          I am embarrassed by dirty stories. (T)                                .005
482          While in trains, busses, etc., I often talk to strangers. (F)         .005
488          I pray several times every week. (F)                                  .025
495          I usually “lay my cards on the table” with people that I am trying
               to correct or improve. (F)                                          .0005
540          My face has never been paralyzed. (F)                                 .025
556          I am very careful about my manner of dress. (F)                       .005

Note.—Letters in parentheses indicate direction items are scored to predict
longer durations of hospitalization.


the protocols from the item analysis criterion
groups. Unfortunately, there were only six
subjects in this “middle” group. The number
of short form MMPIs in the combined group
was insufficient to provide normative data for
the short form and thus these protocols were
omitted from further analysis.

The total number of protocols in this combined
sample, excluding short forms, was 386.
Scale scores were determined for each subject
in the combined sample. In scoring protocols
on the chronicity scale the number of items
in the scale that were not answered was recorded.
Based on the somewhat arbitrary
judgment of the author, those protocols having
three or more omissions were not included
in the further analysis. On this basis 28 subjects
were dropped leaving a total of 358, all with
scores on the chronicity scale, who had various
durations of stay in the hospital. The
Pearson correlation between scale score and
length of stay in months was .41.


The sample was successively dichotomized, 
according to duration of stay, at 3 months, 
6 months, 12 months, 18 months, and 23 
months. No dichotomy was made at 9 months 
because of the uneven sampling for durations 
in this range. The frequency distributions of 
chronicity scale scores for the various dura- 
tion dichotomies are presented in Figure 1. 
These distributions have been smoothed and 
adjusted for the base rate of the particular 
duration of stay (Cureton, 1957). These base 
rates were estimated by determining the dura- 
tion of stay for every male patient given an 
MMPI after admission to a large Veterans 
Administration hospital between October 1954 
and September 1956. Estimates were based 
on 240 cases. 

The optimal cutting point to include all 
cases, after following Cureton’s procedure of 
smoothing and adjusting the area under the 
curves for base rate, is the point of intersec- 
tion of the two curves. These points are indi- 
FIG. 1. Chronicity scale score distributions for dichotomized durations of hospitalization. 


cated in Figure 1. Higher selectivity can be 
obtained, of course, by using two cutting 
points minimizing the area of overlap in the 
mid-range. This paper, because it is con- 
cerned with the total patient population, re- 
ports the findings for one cutting point and 
all of the sample for each dichotomy. 

Use of the scale to improve decision mak- 
ing was the crucial consideration. This was 
evaluated at each of the points of dichotomy 
by the procedures suggested by Meehl and 
Rosen (1955). The predictions or “decisions” 
uniformly were to identify the group staying 


longer rather than the shorter duration group.
Table 3 presents these data and statements
of the inequalities indicating whether or not
use of the scale improves decision making.
Improvement, or predicting better than one
could with base rate information alone, occurs
when the inequality is satisfied. Use of the
scale aids prediction of the longer stay patient
in all of the dichotomies except that of
2 years or longer. It might be mentioned that
the scale improves prediction of the shorter
term patient in all cases.
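
The decision-making check can be reproduced from the percentages in Table 3. In the notation of the table note, P is the base rate of the longer stay group, Q = 1 − P, P₁ is the proportion of valid positives, and P₂ the proportion of false positives; use of the scale helps when Q < P₁/(P₁ + P₂). The sketch below is illustrative: the function name is invented here, and the figures are as read from Table 3, with P taken as 1 minus the printed Q.

```python
def improves_decisions(base_rate_long, valid_pos, false_pos):
    """Return (Q, P1 / (P1 + P2), whether Q < P1 / (P1 + P2))."""
    q = 1 - base_rate_long
    ratio = valid_pos / (valid_pos + false_pos)
    return q, ratio, q < ratio


# cutting point in months: (P, P1, P2), as read from Table 3
table3 = {
    3: (0.52, 0.632, 0.295),
    6: (0.40, 0.579, 0.225),
    12: (0.31, 0.219, 0.045),
    18: (0.21, 0.193, 0.045),
    23: (0.18, 0.189, 0.049),
}

results = {m: improves_decisions(*vals) for m, vals in table3.items()}
for months, (q, ratio, ok) in results.items():
    sign = "<" if ok else "not <"
    print(f"> {months:2d} months: {q:.2f} {sign} {ratio:.2f}")
```

Under these readings the inequality holds at 3, 6, 12, and 18 months and fails at 23 months, in line with the statement in the text.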

Additional cross-validation data were ob-
TABLE 3

ACTUAL AND PREDICTED LENGTHS OF STAY IN MONTHS AT SEVERAL CUTTING POINTS

Predicted              Actual duration
duration        ≤ 3 Months       > 3 Months        Σ       Q < P₁/(P₁ + P₂)
≤ 3 Months      123.7 (70.5%)     68.0 (36.8%)    191.7
> 3 Months       51.7 (29.5%)    117.0 (63.2%)    168.7       .48 < .68
Σ               175.4 (100%)     185.0 (100%)     360.4

                ≤ 6 Months       > 6 Months
≤ 6 Months      169.7 (77.5%)     59.5 (42.1%)    229.2
> 6 Months       49.2 (22.5%)     81.9 (57.9%)    131.1       .60 < .72
Σ               218.9 (100%)     141.4 (100%)     360.3

                ≤ 12 Months      > 12 Months
≤ 12 Months     291.6 (95.5%)    105.7 (78.1%)    397.3
> 12 Months      13.8 (4.5%)      29.7 (21.9%)     43.5       .69 < .83
Σ               305.4 (100%)     135.4 (100%)     440.8

                ≤ 18 Months      > 18 Months
≤ 18 Months     303.0 (95.5%)     69.1 (80.7%)    372.1
> 18 Months      14.2 (4.5%)      16.5 (19.3%)     30.7       .79 < .81
Σ               317.2 (100%)      85.6 (100%)     402.8

                ≤ 23 Months      > 23 Months
≤ 23 Months     333.6 (95.1%)     63.7 (81.1%)    397.3
> 23 Months      17.3 (4.9%)      14.8 (18.9%)     32.1       .82 ≮ .79
Σ               350.9 (100%)      78.5 (100%)     429.4

Note.—P = proportion of longer durations of stay in a specified clinical
population. Q = 1 − P. P₁ = proportion of valid positives. P₂ = proportion
of false positives.
a [illegible] dichotomies is an artifact produced by [illegible] under the
curves for base rates.
tained from the Veterans Administration Hospital,
St. Cloud, Minnesota, on 204 neuropsychiatric
patients having durations of hospitalization
of 1 year or longer. No data 
for patients staying less than 1 year were 
available. The data obtained were evaluated 
against the earlier data for patients staying 
less than 1 year by the method described 
above. The cutting point determined by the 
earlier cross-validation data was used. Pre- 
diction of durations of greater than 1 year by 
the scale was better than could be expected 
by chance or base rate information alone. The 
inequality Q < P₁/(P₁ + P₂), which reflects the extent
decision making is improved by use of 
the scale, was .69 < .75 for this additional 
data, as compared to .69 < .83 for the origi- 
nal data. Thus, while the improvement in de- 
cision making (prediction) is somewhat less 
than the original data showed, this independ- 
ent sample supports the conclusions drawn 
earlier. Further data on 165 neuropsychiatric 
admissions to the Veterans Administration 
Hospital, Minneapolis, Minnesota, were gath- 
ered. The median length of stay for these 
neuropsychiatric admissions in this general 
medical and surgical hospital was 46 days. 
The sample was dichotomized at the median, 
the frequency distributions of scale scores 
smoothed, and the intersect taken as the op- 
timal cutting point. Predictions based on this 
cutting point also were better than could be 


expected by chance or base rate information.
The inequality Q < P₁/(P₁ + P₂) in this case was
.50 < .68.


RELATIONSHIPS WITH SIMILAR SCALES


As Zubin and Windle (1954) might predict,
studies using MMPI items to predict length
of hospital stay have produced remarkably
little item overlap between scales purporting
to do the same thing. Meeker (1958) developed
a 28 item scale by item analysis of the
MMPI items against a criterion of chronicity.
Although his samples were relatively small
and from the same area it is unusual to find
that only 2 items overlapped with the 55
items discriminating in the original sample of
this study. Further, both of these items were
dropped from the present scale in cross-validation.
A scale being developed at the Veterans
Administration Hospital, American Lake,
Washington, has little overlap, if any, with
the present scale or with the Meeker scale.3

The lack of congruence in these independent
studies, in addition to being somewhat startling,
is provocative. Geographical differences
might be suggested but it is not likely that
this variable alone would produce such divergence.
Whatever the reason, adequate sam-


3 Personal communication from John B. Marks,
Veterans Administration Hospital, American Lake,
Washington.


TABLE 4

Q < P₁/(P₁ + P₂) VALUES

Predicted duration
of hospitalization    Meeker scale    Marks scale          Chronicity scale
> 46 Days a           .50 < .56       .50 < [illegible]    .50 < .68
> 3 Months b          .48 < .56       .48 < [illegible]    .48 < .68
> 6 Months b          .60 ≮ .58       .60 < .61            .60 < .72
> 12 Months b         .69 < .76       .69 < 1.00 c         .69 < .83
> 18 Months b         .79 ≮ .69       .79 < 1.00 c         .79 < .81
> 23 Months b         [illegible]     .82 < .88 c          .82 ≮ .79
> 12 Months e         [illegible]     [illegible]          .69 < .75

a Sample of 165 neuropsychiatric admissions to the Minneapolis Veterans
Administration Hospital.
b Data from original pooled cross-validation sample, N = 259.
c Inequality not interpretable because cell frequencies approach zero.
d No cutting point obtained because frequency distributions did not intersect.
e Data from 204 neuropsychiatric admissions to the Veterans Administration
Hospital, St. Cloud, Minnesota, that stayed 1 year or longer.
pling and cross-validation appear to be the
immediate course of action. In the present
study an attempt was made to obtain adequate
sample sizes and cross-validation. Regarding
the sample, however, the number of
subjects who had durations of hospitalization
between 6 and 12 months is decidedly deficient.
Combined sample size was adequate
but undoubtedly could be improved.

These other scales were evaluated as to
their capacity to improve decision making in
comparison with the present scale on the
same data samples whenever the other scale
scores could be obtained. Table 4 presents
the Q < P₁/(P₁ + P₂) values for each of the scales
on a number of predictions and samples. The
present scale uniformly improves decision
making more than the other scales, does so
more consistently, and for predictions of
longer durations of hospitalization.

One might suspect a relationship between
a scale predicting response to psychotherapy
and one predicting length of hospitalization.
There is only one item, however, which overlaps
between Barron’s Ego Strength scale and
the chronicity scale. For this one item the direction
of scoring which would contribute to
a favorable prognosis for psychotherapy from
Barron’s scale would, in the chronicity scale,
contribute to the prediction of long stay. It
seems obvious that the two scales are not
measuring the same underlying variable(s).



DISCUSSION 


The results of this study have provided a
21 item scale which improves prediction of
the longer stay patient. It can discriminate
meaningfully for durations of hospitalization
up to 18 months. Although accuracy of prediction
decreases as duration increases, it does
appear that the scale may have practical
value, particularly for isolating the potentially
chronic group with a minimum of “false positives.”

Such predictions undoubtedly could be
improved by combining probabilities with a
separate predictor such as the demographic
instrument developed by Lindemann et al.
(1959). It is also possible prediction might
have been more accurate had slightly less
rigorous standards been used in the selection
of items. Although the .05 level was used in
the item analysis, the pooled probabilities 
were considerably lower. Predictive efficiency 
may have been enhanced had the .05 level, 
for instance, been used for the pooled prob- 
abilities, thus generating a longer scale. 

The substance of this study is the scale that 
has been developed empirically. While it will 
serve to improve identification of the chronic 
patient, at this point it provides only inter- 
esting clues about self-reported personality 
characteristics underlying chronicity. The 
scale will be factor analyzed to define the 
“roots” of chronicity insofar as they are re- 
flected in these MMPI items. These results 
will be presented in a subsequent paper. It 
was the intent of this study to produce a pre- 
dictive scale which might be independent of 
diagnosis, at least to the extent that it would 
eventually permit isolation of personality fac- 
tors influencing chronicity. Although it is 
most probable that these factors themselves 
will not be entirely independent of diagnosis 
it appeared fruitful to investigate them sepa- 
rately. 

Although considerable care was taken in 
the construction and cross-validation of the 
present scale, in view of the current contra- 
dictory findings in this area it must be viewed 
as tentative and properly subject to further 
evaluation and modification. It should prove 
to be of some immediate interest to the cli- 
nician addressing himself to the study of 
neuropsychiatric chronicity if it is used with 
the proper reservations. Eventually it may in- 
crease our understanding of chronicity and 
consequently improve our ability to study 
and impede it. 


SUMMARY 


The MMPI item pool was item analyzed 
against a dichotomous criterion of neuropsy- 
chiatric hospital chronicity. The 55 items 
which were found to discriminate the criterion 
groups in the original sample were cross-vali- 
dated on the pooled data from three separate 
Veterans Administration neuropsychiatric hos- 
pitals. A 21 item scale was generated which 
was able to predict the “long-stay” patient 
at various dichotomies in duration of stay 
better than one could by chance or by base 
rate information. Although it may be of some 
immediate value, because of the prevalence 
of contradictory findings in this area, the 
scale should be used with caution until fur- 
ther verification and modification. The scale 
will be factor analyzed to provide some no- 
tion of the underlying “roots” of chronicity, 
at least insofar as they can be tapped by self- 
report on a questionnaire. These results will 
be presented in a later publication. 
A variety of judgment tasks have discriminated
those with severe mental or emotional
disorders from less severely disturbed personalities
(Chambers, 1956, 1957; Cooper,
1960). Judgment tasks applied to clinical
populations have varied from psychophysical
judgments of length of lines, density of dots,
etc., to projective types of judgment (e.g.,
choosing personality traits from photographs,
interpreting ambiguous stimuli such as inkblots,
etc.). Sarbin and Hardyck (1955) defined
the latter type of judgment as “conformance”
and applied the concept to tasks
where there is no criterion of judgment validity
other than agreement with a specified
norm group.

In the present study a conformance type of
judgment task was given to college students.
It was hoped that the task might be sufficiently
sensitive to discriminate between well
adjusted and poorly adjusted students, even
though poorly adjusted college students would
not generally compare in severity of pathology
with the psychotic groups previously discriminated
by judgment tasks.


METHOD 


Picture Identification Test (PIT). At the beginning
of a school year, all students at Georgia [illegible]
Junior College were given the group form of the
PIT (Chambers & Broussard, 1960). There
were 56 male and 186 female students tested. For
part of this test the subjects were given 10 cards on
each of which were eight head and shoulder photographs
of individuals of the same sex as the subject.
With each card the subject was given a list of
brief behavior descriptions representing 21 needs of
the Murray (1953) need system. The subject was
instructed to match each description to a photograph
and could use any picture as often as he desired. The
subjects thus made 10 selections for each of the 21
needs making a total of 210 choices.

Scoring. Norms for the popularity of judgment 
choices were based on the choices of 100 randomly 
selected students of each sex from the general stu- 
dent population. The score for any selection was de- 
termined by the number of those in the norm group 
making that selection. Twenty-one judgment sub- 
scores were obtained by summing the 10 choice 
scores for each need. A total judgment score was ob- 
tained by summing all 210 choice scores. The scores 
of the subjects of the norm groups were adjusted to 
eliminate their own contribution to the norm figures. 
The 21 raw judgment subscores and the total judg- 
ment score were converted to standardized scores de- 
rived from the total student population and based 
on an eight-point scale with the mean separating 
points four and five on the scale. 
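
The scoring rule in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions: the data layout (a dictionary keyed by card and need), the function names, and the toy choices are all invented here, and the conversion to the eight-point standardized scale is omitted.

```python
from collections import Counter


def popularity_norms(norm_group_choices):
    """Count how many norm-group members made each (card, need) -> photo choice."""
    norms = Counter()
    for choices in norm_group_choices:
        for card_need, photo in choices.items():
            norms[(card_need, photo)] += 1
    return norms


def judgment_scores(choices, norms, in_norm_group=False):
    """Return (subscore per need, total score) for one subject.

    Each selection scores the number of norm-group members who made the
    same selection. For a norm-group member, 1 is subtracted from each
    selection to remove the subject's own contribution to the norms.
    """
    subscores = Counter()
    for (card, need), photo in choices.items():
        score = norms[((card, need), photo)]
        if in_norm_group:
            score -= 1
        subscores[need] += score
    return dict(subscores), sum(subscores.values())


# Toy norm group of three students, one card, two needs (hypothetical).
s1 = {(1, "n Achievement"): "A", (1, "n Affiliation"): "B"}
s2 = {(1, "n Achievement"): "A", (1, "n Affiliation"): "C"}
s3 = {(1, "n Achievement"): "D", (1, "n Affiliation"): "B"}
norms = popularity_norms([s1, s2, s3])

new_subject = {(1, "n Achievement"): "A", (1, "n Affiliation"): "C"}
new_sub, new_total = judgment_scores(new_subject, norms)
s1_sub, s1_total = judgment_scores(s1, norms, in_norm_group=True)
print(new_sub, new_total)
print(s1_sub, s1_total)
```

With the full test, each need subscore would sum 10 such choice scores (one per card) and the total would sum all 210.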

Subjects. At the end of the school year in which 
the PIT had been administered, each member of the 
faculty was asked to select the 10 best adjusted stu- 
dents of each sex and the 10 most poorly adjusted 
students of each sex from those of his acquaintance. 
The faculty members were instructed to judge stu- 
dents on the basis of emotional stability and ma- 
turity rather than on intellectual ability. None of the 
faculty knew of the PIT results. Fourteen faculty 
members felt they had enough general contact with 
the student body to make selections. Those students 
selected by three or more of the faculty judges as 
best adjusted were chosen for a “well adjusted” group 
and those receiving three or more choices as most 
poorly adjusted were selected for a “poorly ad- 
justed” group. This procedure provided 15 boys and 
16 girls for the poorly adjusted group and 16 boys 
and 17 girls for the well adjusted group. 


RESULTS 


Judgment subscores of well adjusted and
poorly adjusted boys were compared for differences
by means of the t test. The well adjusted
group showed a higher mean degree of
judgment conformity on all 21 subscores and
they differed significantly (at the .05 level or
better) from the poorly adjusted group on 19
of the 21 subscores.

A similar analysis was applied to the re- 
sults of the well adjusted and poorly adjusted 
TABLE 1


TOTAL JUDGMENT SCORES OF WELL ADJUSTED AND
POORLY ADJUSTED STUDENTS


Total Judgment     Well Adjusted     Poorly Adjusted
Score              Boys    Girls     Boys    Girls

1 4 4 1 

2 3 4 

3 2 i 1 2 

4 5 5 2 3 
. . . . . . . . . . . . . . . . . . . . . . . . 3

5 1 4 4 

6 1 1 3 

7 1 3 1 

8 1 4 3 


a Dotted line indicates mean for general student population. 


female groups. The well adjusted girls showed 
a higher mean degree of judgment conformity 
on all but two subscores (n Dominance and 
n Infavoidance) and they were significantly 
more conforming in judgment than the poorly 
adjusted group on 11 of the 21 subscores. 

Table 1 presents the total judgment scores
for all four groups. A test of the difference between
the combined male and female well adjusted
groups (N = 33) and the combined
male and female poorly adjusted groups (N
= 31) yielded a t of 11.09, significant at the
.001 level. A cutting point at the mean for
the general population of students (M = 4.5)
classified 71% of the poorly adjusted students
below the mean and 85% of the well adjusted
students above the mean.
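
The two hit rates can be combined into a single rate of agreement with the faculty selections. The short calculation below converts the rounded percentages back to whole students, which is an assumption of this sketch since the exact counts are not recoverable from the table as printed here.

```python
well_n, poor_n = 33, 31  # group sizes given in the text

# Convert the reported percentages back to whole students (an assumption:
# the published percentages are rounded, so the counts are inferred).
well_above = round(0.85 * well_n)
poor_below = round(0.71 * poor_n)

agreement = (well_above + poor_below) / (well_n + poor_n)
print(f"{well_above} + {poor_below} of {well_n + poor_n} classified "
      f"in agreement: {agreement:.0%}")
```

The result is consistent with the overall figure of 78% given in the Summary.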

Further comparisons revealed that the well
adjusted groups scored significantly higher
than the poorly adjusted groups on the Princeton
Scholastic Aptitude Test (t = 2.64, p
< .02). This finding raised the possibility
that the PIT judgment score might be correlated
with the SAT. A Pearson r of .28 was
found between the PIT total judgment score
and the SAT. Based on an N of 212, this correlation
was significant at the .01 level but
was not high enough to indicate use of the
PIT judgment score as an index of scholastic
aptitude or to suggest that prediction of general
adjustment to a college environment
based on the PIT judgment score would yield
highly similar results to prediction based on
the SAT.
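
Both halves of this argument can be checked with a few lines of arithmetic. The sketch uses the standard t test for a Pearson correlation; the critical value of about 2.60 for a two-tailed test at the .01 level with 210 degrees of freedom is a textbook figure supplied here, not taken from the source.

```python
from math import sqrt


def t_for_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing a Pearson r against zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


r, n = 0.28, 212
t_value = t_for_r(r, n)
shared_variance = r * r

print(f"t({n - 2}) = {t_value:.2f}")
print(f"shared variance = {shared_variance:.1%}")
```

The t value comfortably exceeds 2.60, yet the two scores share under 8% of their variance, which is why a significant correlation of this size does not make one score a substitute for the other.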
DISCUSSION 


Evidence from the present study and from 
those previously cited points to impaired or 
deviant perceptual judgment among the emo- 
tionally or mentally disturbed. In an attempt 
to account for the relationship between ad- 
justment and judgment, Sarbin and Hardyck 
(1955) reasoned that a lack of perceptual 
conformance would increase the probability 
of a deviant behavioral response since be- 
havioral reactions are dependent on percep- 
tual reactions. Mental and emotional disor- 
ders are defined by deviant behavior; there- 
fore there is a direct connection between poor 
perceptual conformance and maladjustment, 
according to their theory. 

The above theory does not state what con- 
ditions underlie poor perceptual conformance. 
The conditions could, of course, be due to 
brain damage or be of genetic origin. 

There is a further possibility that judgment 
may be distorted by emotions and attitudes. 
Jurors and judges are dismissed from cases 
where they hold certain attitudes affecting 
the case. Perceptual judgments may be sub- 
verted to maintain what Morgan (1956) has 
termed the self-preservation of attitudes and 
beliefs. According to this theory, there is a 
selective process whereby evidence favorable 
to existing attitudes is registered while nega- 
tive evidence is excluded. It is obvious that 
one who persistently modifies perception 
rather than belief, when the two are incom- 
patible, will eventually find himself at odds 
with others and with reality. It would seem 
better, from a mental health viewpoint, that 
perception should be uncensored and that 
attitudes should be influenced by perceptual 
evidence. 

Instances of poor perceptual conformance 
may provide indicators of specific psychologi- 
cal conflicts. An inspection of the data of the 
present study showed that, for a given indi- 
vidual, perceptual conformance varied con- 
siderably from one need to the next. Future 
research might profitably attempt to determine
whether specific problem or conflict 
areas can be diagnosed by differences in per- 
ceptual conformance scores of the various 
needs as measured by the PIT. 




Judgment of Photographs and Adjustment 


SUMMARY 


The group form of the Picture Identification
Test was administered to all students at
a small state junior college at the beginning
of a school year. Each subject received 21
judgment subscores and a total judgment
score based on his matchings of photographs
of people with Murray need descriptions. At
the end of the school year, 33 students were
selected by faculty members to comprise an
emotionally well adjusted student group and
31 students were chosen as showing poor
emotional adjustment.

The boys chosen as well adjusted made significantly
higher scores (more popular matchings)
than the poorly adjusted boys on 19 of
the 21 judgment subscores. The well adjusted
girls made significantly higher scores than did
the poorly adjusted girls on 11 of the 21
judgment subscores.

On the total judgment score, the difference
between the combined well adjusted and combined
poorly adjusted groups was highly reliable
(t = 11.09, p < .001). Using the general
population mean as a cutting point, 78% of
the subjects could be classified by the total
judgment score of the PIT in agreement with
faculty selections of well adjusted and poorly
adjusted students. A low but significant r of
.28 (p < .01; N = 212) was found between 
the PIT total judgment score and the Prince- 
ton Scholastic Aptitude Test. 

The results were discussed in relation to 
Sarbin’s theory of perceptual conformance and 
Morgan’s theory of perceptual selectivity as 
a means of insuring the self-preservation of 
existing attitudes and beliefs. 
THREE RORSCHACH SCORES INDICATIVE OF 
SCHIZOPHRENIA 


IRVING B. WEINER 
University of Rochester School of Medicine and Dentistry 


The use of Rorschach summary scores in 
differential diagnosis, though widespread in 
clinical practice, has seldom been justified by 
the research literature. Attempts to cross- 
validate checklists such as the Miale and 
Harrower-Erickson (1940) signs of neurosis 
and the Davidson Rorschach adjustment scale 
(Davidson, 1950) have typically been unsuc- 
cessful (Berkowitz & Levine, 1953; Corsini & 
Uehling, 1954). Rieman (1953) combed the 
literature for Rorschach signs supposedly dis- 
criminatory between neurotics and schizo- 
phrenics. Of 86 such “indicators” he evalu- 
ated, only 5, or barely a chance expectation, 
discriminated significantly between the neu- 
rotic and schizophrenic subjects he studied. 
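
The phrase "barely a chance expectation" can be made concrete with a short calculation. The sketch assumes each of the 86 indicators was tested independently at the .05 level; neither the level nor the independence is stated in this passage, so both are assumptions, and the function name is invented here.

```python
from math import comb


def prob_at_least(k, n, alpha):
    """Binomial probability of k or more significant results among n
    independent tests when every null hypothesis is true."""
    return sum(comb(n, i) * alpha**i * (1 - alpha) ** (n - i)
               for i in range(k, n + 1))


n_signs, alpha = 86, 0.05  # alpha is an assumed significance level
expected = n_signs * alpha
p_five_or_more = prob_at_least(5, n_signs, alpha)

print(f"expected by chance: {expected:.1f} of {n_signs}")
print(f"probability of 5 or more by chance: {p_five_or_more:.2f}")
```

Under these assumptions about four significant signs are expected by chance alone, so finding five is unremarkable, which motivates the guideline below to keep the number of independent tests to a minimum.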

Results such as these have dampened re- 
search interest in Rorschach summary scores 
as diagnostic indicators, if the paucity of re- 
cent literature in this area may be taken as a 
criterion. However, related work has pointed 
up potential sources of error in the determi- 
nation of diagnostic signs which often have 
not been taken into account. From papers 
by Cronbach (1949), Fiske and Baughman 
(1953), and Knopf (1956) several guidelines 
for the empirical derivation of Rorschach 
signs may be abstracted: (a) the number of 
independent statistical tests applied should be 
kept at a minimum; (b) nonparametric meth- 
ods of statistical inference should be used; 
(c) the response total should be controlled;
and (d) Rorschach scores and dependent variables
must be kept independent. The present
study consists of the empirical selection of
three Rorschach signs associated with severity
of psychopathology in an exploratory study
and two subsequent attempts to cross-validate
these summary scores as indicators of schizophrenia.


EXPLORATORY STUDY 


The exploratory sample consisted of all 
adult patients in a 6-month period from the 
psychiatric services of a general hospital for 
whom scorable Rorschach protocols were 
available, with the exception of those pa- 
tients whose primary diagnosis was organic 
rather than functional in nature. The subjects’ 
hospital and clinic records were utilized 
to assign them to one of three diagnostic 
categories: neurosis, which included cases of 
conversion hysteria, obsessive-compulsive neu- 
rosis, anxiety reaction, and neurotic depressive 
reaction; character disorder, which applied to 
instances of personality trait and personal- 
ity pattern disturbances; and psychosis, which 
included schizophrenic, psychotic depressive, 
and involutional psychotic reactions. The 
diagnostic criterion was the label assigned to 
the patient during the period of psychiatric 
evaluation in which the Rorschach had been 
given. For hospital patients, these labels were 
the discharge diagnoses; for clinic patients, 
the recorded consensus of an intake committee 
was used. 

The Rorschach protocols for this sample, 
which numbered 71, had previously been 
scored for the presence of a number of signs 
under investigation in a related context. Three 
of the signs unexpectedly appeared to differentiate 
among the diagnostic categories. To 
the extent that the labels neurosis, character 
disorder, and psychosis represent a continuum 
of increasing disturbance, severity of illness 
was significantly associated with tendencies to 
(a) give 1 or 2 CF, (b) have a Sum C between 
1.5 and 3.0, and (c) give at least 1 C 
or CF response with no C’ responses. Table 1 
indicates the number of subjects in each diagnostic 
category who received from none to all 
three of these signs. Subsequent analysis revealed 
no significant differences between the 
three diagnostic groups in mean age or in 
mean or variance of response total. Furthermore, 
each group was composed of approximately 
two-thirds females and one-third 
males, thus mitigating any influence of sex. 


TABLE 1 

FREQUENCIES OF EXPLORATORY SUBJECTS OF DIFFERENT 
DIAGNOSES WITH A GIVEN NUMBER OF SCHIZOPHRENIC 
INDICATORS 

                           Number of Indicators* 
Diagnostic Category       0      1      2      3 
Neurosis                 18      2      4    [illegible] 
Character disorder        8      4      5      7 
Psychosis                 5      2      7      7 

* χ² = 13.581 with 6 df; p < .05. 

CROSS-VALIDATION STUDIES 

Procedure 


Two separate samples were utilized to evaluate the 
concurrent validity of a checklist comprised of the 
above three summary scores. These samples, A and 
B, were again selected from case files and consisted 
of 52 patients from the 6 months following 
and 89 patients from the 12 months preceding 
the exploratory period. Since the psychotic group in 
the exploratory study had contained few nonschizophrenic 
patients, only schizophrenics were included 
in the psychotic group for the cross-validating samples. 
Sample A contained 16 neurotics, 18 character 
disorders, and 18 schizophrenics; for Sample B the 
totals were 27, 31, and 31. The total population 
ranged in age from 15 to 62 with a mean of 29.62 
and had given a mean number of 13.28 responses 
to the Rorschach. Within each sample there 
were no significant differences among the three diagnostic 
categories in mean and range of age [...] 

[...] to which diagnostic [...] schizophrenic had 
depended [on] Rorschach findings, it was unlikely 
that the signs under consideration here (1 or 2 
CF, Sum C from 1.5 to 3.0, and C or CF without C’) 
had contributed prominently to the psychologist’s opinion 
that the patient’s Rorschach was consistent with the presence 
of schizophrenia. To test this ancillary hypothesis, 
five staff psychologists, who had in fact done or su- 
pervised the testing of more than three-fourths of 
the subjects in the study, were asked to select from 
the following list of 10 Rorschach variables 6 which 
they felt were of either primary or secondary im- 
portance in the assessment of schizophrenia: (a) 
number and/or % Dd; (b) Sum C; (c) number 
confabulated and contaminated Rs; (d) number CF; 
(e) number and/or % P; (f) F+%; (g) num- 
ber and/or quality M; (h) number C, Csymb, and 
Cnam; (i) amount bizarre content; and (j) number 
C’. 


Results 


Table 2 indicates the number of subjects 
with different diagnoses in Samples A and B 
who received a given number of the signs. 
The degree of association between diagnostic 
category and number of signs, as estimated 
by χ², is significant beyond the .001 level of 
confidence for both samples. Further investi- 
gation of Table 2 reveals that the major dif- 
ference exists between the schizophrenic group 
on one hand and the neurotics and character 
disorders on the other. It may be seen that in 
Sample A, 87% of the neurotics and 78% of 
the character disorders received one sign or 
less, while only 22% of the schizophrenics 
had less than two signs; for Sample B the 
corresponding percentages are 81, 68, and 22. 
Chi square values computed for these grouped 
data were significant beyond the .001 level for 
both samples. Comparisons of the relative frequency 
with which the neurotics and character 
disorders received one or less or two or 
more signs yielded χ² values smaller than 1.00 
for each sample. 


TABLE 2 

FREQUENCIES OF SUBJECTS OF DIFFERENT DIAGNOSES 
IN SAMPLES A AND B WITH A GIVEN NUMBER OF 
SCHIZOPHRENIC INDICATORS 

                                 Number of Indicators* 
Diagnostic Category         0            1            2            3 

Sample A 
  Neurosis             [illegible]  [illegible]  [illegible]  [illegible] 
  Character disorder       12            2            4            0 
  Schizophrenia             3            1            5            9 

Sample B 
  Neurosis             [illegible]       6            4       [illegible] 
  Character disorder   [illegible]  [illegible]       9            1 
  Schizophrenia             3            4       [illegible]  [illegible] 

* For Sample A χ² = 28.881 with 6 df; p < .001. For 
Sample B χ² = 30.072 with 6 df; p < .001. 
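
The grouped comparison reported for Sample A can be checked with a short computation. The sketch below is an illustration only: the cell counts are back-calculated from the reported percentages and group sizes (14 of 16 neurotics, 14 of 18 character disorders, and 4 of 18 schizophrenics with one sign or less), so they are approximations rather than the author's tabulated data, and the function name is hypothetical. The critical value 13.816 is the standard χ² cutoff for 2 df at the .001 level.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Rows: neurosis, character disorder, schizophrenia.
# Columns: one sign or less, two or more signs (counts inferred, see above).
sample_a_grouped = [[14, 2], [14, 4], [4, 14]]
stat, df = chi_square_statistic(sample_a_grouped)

CRITICAL_001_DF2 = 13.816  # chi-square critical value, 2 df, alpha = .001
print(round(stat, 2), df, stat > CRITICAL_001_DF2)

# Neurotics versus character disorders only (2 x 2, inferred counts).
stat_nc, df_nc = chi_square_statistic([[14, 2], [14, 4]])
print(round(stat_nc, 2), df_nc)
```

Under these inferred counts the three-group statistic exceeds the .001 cutoff, while the neurotic versus character disorder comparison stays below 1.00, which agrees with the pattern described in the text.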

The data were analyzed further to determine 
if the three signs were contributing differentially 
to the above results. Six 2 × 3 
contingency tables were used to assess the 
association between the three diagnostic categories 
and the presence or absence of each 
sign for Samples A and B. The obtained χ² 
values were all significant beyond the .01 
level. 

The ratings by the psychologists of which 
Rorschach variables are prominent in their 
evaluation of possible schizophrenia also 
yielded a clear-cut result. Six of the elements 
were endorsed as being of primary or sec- 
ondary importance by at least four of the 
five raters, and a seventh was chosen by three 
raters. The remaining three variables were not 
selected by any of the psychologists as being 
of either primary or secondary importance in 
the diagnosis of schizophrenia. These three 
were Sum C, number CF, and number C’, the 
variables under investigation in this study. 

The fact that positive results were thus ob- 
tained for indicators not endorsed by the 
testers might suggest either that the test re- 
ports carried little weight in the final psychi- 
atric diagnosis or that the psychologists’ diag- 
nostic conclusions were little influenced by 
their expressed views on indicators of schizo- 
phrenia. To deal with this question, an addi- 
tional analysis was undertaken with Sample A. 
It was found that the psychologists’ test re- 
ports and the finally established diagnoses 
concurred in the presence or absence of schizo- 
phrenia in 90% of the cases, which casts 
doubt on the first of the above alternatives. 
Investigation of those of the endorsed signs 
which can be objectively scored revealed that 
the schizophrenic group had a significantly 
(p < .05) lower F + % and tended (p < .10) 
more frequently to have given 1 or more pure 
C response and/or 1 or more M— response 
and/or less than 1 M response than the non- 
schizophrenic groups. However, only five of 
the schizophrenics in Sample A had F + %s 
below 70, only five failed to give M, and only 
four gave any M— or C; only four had as 

many as two of these signs. Therefore, these 
scores, though differentiating between groups, 
were not present with sufficient frequency 
to have much facilitated the psychologist’s 
evaluation in the individual case. Further- 
more, no relationship could be discovered be- 
tween diagnostic category and P, P%, or 
Dd%. Consequently, it would seem in many 
cases that the psychologists, sometimes of 
necessity and sometimes of choice, had based 
their diagnostic impressions on indicators 
other than those they endorsed in the check- 
list. 


Discussion 


The data clearly indicate that likelihood of 
being diagnosed schizophrenic in the psychi- 
atric setting studied is associated with tend- 
encies to give 1 or 2 CF, a Sum C from 1.5 
to 3.0, and C or CF without C’ on the Ror- 
schach. The fact that these three schizophrenic 
indicators were suggested by an exploratory 
study and cross-validated at highly significant 
levels of confidence in two subsequent studies 
makes it unlikely that they merely reflect 
chance fluctuations. The built-in controls for 
age, response total, and sex eliminate several 
frequent sources of error. The author’s em- 
phasis on nonexistent contaminating variables 
is felt necessary because the results cannot 
readily be explained in terms of prevalent 
Rorschach theory. There is no literature which 
suggests that these three signs are associated 
with schizophrenia, and the rating sheet used 
in this study demonstrates that psychologists, 
at least those in the setting where this re- 
search was done, do not consciously utilize 
these signs in evaluating schizophrenic potential. 
That positive results were achieved seems, 
in view of the congruence between the ex- 
aminers’ judgments and the diagnostic labels 
of the subjects, attributable to the finding that 
some of the traditional signs of schizophrenia 
they endorsed were often ignored by them 
and others occurred with insufficient frequency 
to figure prominently in their individual 
diagnostic conclusions. The nature of 
the design does not allow further inference 
concerning the validity of these traditional 
signs, however. 

A tentative rationale for the findings may 
be offered. The three signs taken together represent 
some minimal use of chromatic color 
without use of achromatic color. Individuals 
who avoid color almost entirely or who use 
both chromatic and achromatic color freely 
would not receive the signs. It may follow 
that, within a patient population, those persons 
who use color freely are displaying the 
emotional lability frequently associated with 
repressive defense, while those who avoid 
color are manifesting the isolation of affect 
which accompanies intellectual defenses. The 
remaining patients, having neither pattern, 
might be viewed as lacking a stable defensive 
structure and being prone to schizophrenic 
reactions under stress. Other explanations are 
certainly possible, but confirmation of any 
hypotheses suggested by the data would require 
additional research. 

It is the author’s feeling that even with 
present knowledge the data have value for the 
clinician. It should be noted that the signs 
were not derived and cross-validated by comparing 
blatant schizophrenics with normals. 
Psychological test data gathered in such studies 
are often useful for making discriminations 
only in situations where the discrimination is 
so obvious that psychological tests are superfluous. 
The subjects in this study were selected 
from actual case files and typically had 
been referred because the diagnosis was not 
clear. Hence they constitute the type of diagnostic 
problem with which the clinician must 
deal in his daily practice, and it is within such 
a population that the three signs were so extremely 
in accord with the diagnoses finally 
established. The signs by no means identified 
all the schizophrenic patients. However, the 
data seem to justify their inclusion among 
those Rorschach variables commonly construed 
as relating to the presence of a schizophrenic 
disorder. 
SUMMARY 


Three Rorschach signs—1 or 2 CF, Sum C 
between 1.5 and 3.0, and C or CF without 
C’—were found significantly associated with 
severity of psychopathology in an exploratory 
study. In two cross-validating studies, with 52 
and 89 subjects, each of these signs was re- 
ceived significantly more frequently by schizo- 
phrenic patients than by neurotics and char- 
acter disorders. The design contained controls 
for age and sex and for mean and variance of 
Rorschach response total. A tentative rationale 
for the results is offered, and it is felt that 
the data recommend these signs for inclusion 
among Rorschach criteria for the presence of 
schizophrenia. 
ANOTHER LOOK AT MMPI PROFILE TYPES IN 
MULTIPLE SCLEROSIS 

Canter (1951) reported a descriptive study 
of MMPI profiles of patients with multiple 
sclerosis (MS). Although recognizing the limi- 
tations of the approach, he calculated the av- 
erage MMPI profile for this group of 33 
World War II veteran patients and inferred 
from the mean profile for the group that the 
typical personality configuration in MS in- 
cluded a reaction to the stress of the illness 
with depression and its accessory symptoms. 

When depression is a major variable under 
study, the averaging of MMPI profiles can 
obscure important profile differences espe- 
cially if subsamples are combined, one of 
which has very high D scores and the other 
of which has very low D scores. This is a par- 
ticularly important consideration in the study 
of an illness such as MS in which both reac- 
tions of high depression and low depression 
because of denial and repression have been ob- 
served. If discrete profile types were to be 
produced reflecting each of these reaction 
types, such an important difference would be 
cancelled out and masked by averaging. 
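
The masking effect of averaging can be shown with a small numerical sketch. The D-scale T scores below are invented for illustration and are not data from Canter's sample or the present one; they simply assume one high-D subgroup and one low-D subgroup of equal size.

```python
from statistics import mean

# Hypothetical D-scale T scores for two subgroups of equal size.
high_d = [85, 88, 90, 92]   # markedly depressed subgroup (invented values)
low_d = [48, 50, 52, 55]    # nondepressed subgroup (invented values)

combined = high_d + low_d
combined_mean = mean(combined)

print(mean(high_d), mean(low_d), combined_mean)

# Distance of the closest individual score from the combined mean.
closest_gap = min(abs(t - combined_mean) for t in combined)
print(closest_gap)
```

In this invented case the combined mean sits at the conventional cutoff of 70 although no individual score lies within 15 points of it, so the mean profile describes neither subgroup.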

The aim of the present study was to at- 
tempt to check Canter’s MMPI findings and 
to answer the following questions: (a) Can 
a typical response to MS in the direction of 
depression be inferred from the MMPI? (b) 
Among patients with neurological lesions, are 
MMPI profiles indicative of depression more 
common in MS than in other conditions? To 
investigate the latter question, a control group 
of neurological patients was selected who had 
suffered brain injury from external causes. 

A second purpose of this paper is to report 
on significant relationships between MMPI 


1 We wish to thank William Schofield, Jan Duker, 
and Irving Gottesman for suggestions about the 
preparation of this manuscript. 


profile characteristics and illness and demographic 
variables which became apparent when 
profiles were studied as depressed and nondepressed 
types rather than as average group 
profiles. 


METHOD 
Samples 


The MS sample consisted of 25 male veterans hospitalized 
on the Neurology Service of the Minneapolis 
Veterans Administration Hospital who had received 
a medical diagnosis of MS. The control group consisted 
of 25 male veterans from Neurology with a 
medical diagnosis of traumatic brain injury. Except 
for three cases in the MS group, the samples represented 
all of the cases with these diagnoses who had 
been administered the MMPI and Wechsler-Bellevue 
on the Neurology Service during the years 1949 to 
1952. Three MS cases currently on the Neurology 
Service were tested and included in the analysis to 
increase the size of the sample. 
Age, IQ, and duration of illness data are reported 
in Table 1. The group differences in mean age and 
mean IQ were not statistically significant. The mean 
duration of illness of the Minneapolis MS sample 
was 58 months compared to 30 months for the 
control group; this difference is reliable (p < .001). 
Canter’s MS group would appear to be comparable 
to the present MS group in age and duration of illness 
although there may be differences between the 
groups in socioeconomic status. Also, Canter studied 
outpatient veterans while the present sample of veterans 
was hospitalized. 

Table 1 also reports the MMPI average T scores 
of the three samples. All three of the mean profiles 
were highly similar in shape; Canter’s mean profile 
shows higher elevations on Hs, D, and Hy than the 
other two profiles. In addition, the Minneapolis MS 
sample is significantly higher on D and Sc than the 
Minneapolis control group. The magnitudes of the 
latter differences are small. 


TABLE 1 

MEAN AGE, IQ, DURATION OF ILLNESS, AND MMPI T SCORES FOR MINNEAPOLIS 
MULTIPLE SCLEROSIS AND CONTROL SAMPLES AND CANTER MULTIPLE SCLEROSIS SAMPLE 

                                     Duration 
Sample                 Age    IQ     (months)   L      F      K      Hs           D      Hy     Pd     Mf     Pa     Pt     Sc           Ma     Si 
Minneapolis MS 
  sample (N = 25)      32.9   102.4  58         55.4   55.0   59.4   71.6         69.6   68.0   60.0   51.5   54.7   62.3   65.0         54.6   52.0 
Minneapolis control 
  sample (N = 25)      30.6   102.9  30.2       54.2   53.5   56.2   68.7         62.3   65.6   56.4   51.4   52.3   59.9   60.1         57.1   49.2 
Canter MS 
  sample (N = 33)      32     —      48         55     55     58     [illegible]  79     75     59     55     53     63     [illegible]  55     — 


Procedure 


The Minneapolis MS and control samples were 
divided on the basis of MMPI scores. The 
profile groupings were based on the following considerations. 
First, since there is a relatively high 
incidence of one scale being elevated over T score 70 
even in normal samples (Hathaway & Meehl, 1951), 
it was decided to classify the profiles as abnormal if 
two or more scales were elevated beyond the normal 
limits. Second, since the investigation focused on depression, 
it seemed rational to cut the samples on the 
basis of elevation on the D scale. Third, inspection 
of the set of profiles suggested that the abnormal-normal 
dichotomy and the depressed-nondepressed 
categories would describe much of the relevant information 
obtained in a majority of the profiles. 

The following categories and rules were adopted: 

1. Normal Depressed: Not more than one scale 
over 70, D > 60 

2. Normal Nondepressed: Not more than one scale 
over 70, D < 60 

3. Abnormal Depressed: Two or more scales over 
70, D > 80 

4. Abnormal Nondepressed: Two or more scales 
over 70, D < 80 
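
The four classification rules above can be sketched as a small function. This is an illustration under stated assumptions, not the authors' procedure: the rules as printed do not say how a D score falling exactly on a cutoff (60 or 80) is handled, so the sketch treats it as depressed, and it counts every scale supplied in the input toward the "over 70" tally. The function name and the example scores are hypothetical.

```python
def classify_profile(t_scores):
    """Assign an MMPI profile to one of the four profile types.

    t_scores: dict mapping scale name to T score; must include "D".
    Assumptions: a D score exactly at a cutoff counts as depressed, and
    every scale in the dict is eligible for the elevation count.
    """
    n_elevated = sum(1 for t in t_scores.values() if t > 70)
    d = t_scores["D"]
    if n_elevated >= 2:
        return "Abnormal Depressed" if d >= 80 else "Abnormal Nondepressed"
    return "Normal Depressed" if d >= 60 else "Normal Nondepressed"


# Hypothetical profiles (invented T scores).
print(classify_profile({"Hs": 60, "D": 55, "Hy": 65}))  # Normal Nondepressed
print(classify_profile({"Hs": 75, "D": 85, "Hy": 72}))  # Abnormal Depressed
```

Which scales the authors counted toward the elevation rule is not stated in the text, so a caller would need to decide whether to pass only the clinical scales or the validity scales as well.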


RESULTS 


Table 2 shows the resulting distribution of 
the profile types in the Minneapolis MS and 
control samples. The chi square test revealed 
the distributions of the two samples to be reliably 
different (χ² = 7.76, p < .05). 


TABLE 2 

FREQUENCY OF PROFILE TYPES IN MULTIPLE SCLEROSIS 
AND CONTROL GROUPS 

                            Multiple 
Profile Type                Sclerosis      Control 
Normal Depressed                5              3 
Normal Nondepressed        [illegible]    [illegible] 
Abnormal Depressed              8         [illegible] 
Abnormal Nondepressed      [illegible]    [illegible] 


Table 3 shows the mean age, IQ, duration 
of illness, and MMPI T scores for each of 
the four profile types in the MS and control 
groups. Within the four MS profile types, significant 
F ratios were obtained for age (p 
< .05), IQ (p < .01), and duration of illness 
(p < .01). From the Scheffé test (Scheffé, 
1953), which was applied to make all possible 
comparisons of means, it appears that 
(a) the Normal Nondepressed type was sig- 
nificantly younger than the Abnormal De- 
pressed type, (b) the Normal Nondepressed 
type had a significantly higher mean IQ than 
the Abnormal Depressed type, (c) the Nor- 
mal Nondepressed type had a significantly 
shorter duration of illness than any of the 
three other types, and (d) the Abnormal Non- 
depressed type also had a significantly higher 
mean IQ than the Abnormal Depressed type. 

No significant F ratios were found in the 
analysis of the control group. 

There appeared to be an interaction be- 
tween the age, IQ, and duration of illness 
variables within the profile types in the MS 
sample. Compared to the mean score for the 
total Minneapolis MS sample, every case of 
the Normal Nondepressed profile type had 
lower than average age and higher than av- 
erage IQ and shorter than average duration 
of illness except for one case where age was 
at the mean. Seven of the eight cases of the 
Abnormal Depressed type had IQ scores be- 
low the mean of the total MS group. 
DISCUSSION 


From the present findings, it would seem 
safe to state that there is a reaction to MS in 
the direction of severe depression more frequently 
than in a neurological disorder such 
as traumatic brain injury. Thirty-two percent 
of the MS sample obtained MMPI profiles of 
the Abnormal Depressed type compared to 
4% of the control sample. The longer duration 
of illness in the MS group could be a factor 
contributing to the greater incidence of 
such profiles in the MS sample. Vieg (1947) 
reported that the most important mechanism 
used by patients in adjusting to MS was repression, 
but that as the disease progressed 
this kind of equilibrium could not be maintained 
and the patients gave up their front of 
being emotionally well adjusted and happy. It 
would not seem warranted to call dysphoria 
the typical reaction to MS since only a third 
of the sample fell into the Abnormal Depressed 
category. Rather, it would appear that 
there are several kinds of reaction to the disease. 
Philippopoulos, Wittkower, and Cousineau 
(1958) found a broad range of personality 
structures in a group of 40 MS patients 
studied through interview and psychological 
tests. 

Most studies of MS (Grinker, Ham, & 
Robbins, 1948; Langworthy, 1948; Philippopoulos 
et al., 1958; Sugar & Nadell, 1948; 
Vieg, 1947) seem to show a strong thread of 
consistency in reporting that repression, inability 
to express hostile feelings directly, 
overconformity, and/or hysteroid traits 
are common in patients with MS 
either before or after, or both before and after, the 
manifestations of the MS. It 
should be noted in passing that similar personality 
traits have been reported in 
conditions generally considered to be psychosomatic 
in nature such as the recent investigation 
of ulcer patients by Marshall (1960). 
Some support for the observation that repression 
and denial are preferred mechanisms of 
at least some MS patients could be found in 
the present MMPI data but the same support 
could be found for the control sample. It 
would be impossible to distinguish cause and 
effect in any case. 
On the one hand, it would seem that it is 
unwarranted to make sweeping generalizations 
about typical emotional reactions to MS, but 
on the other hand, it is not necessary to adopt 
a narrow idiographic point of view. There 
is evidence that multivariate classification on 
demographic variables such as age, intelli- 
gence, duration of illness, and perhaps sex, is 
needed to establish most effectively the rela- 
tionships between illness characteristics and 
personality variables. In the present study, it 
would appear that severe depression is more 
likely to occur in less intelligent male MS pa- 
tients and that younger, more intelligent male 
patients with shorter duration of illness are 
more likely to have a repertory of responses 
which enable them to avoid depression. These 
relationships are not evident in the control 
sample. The simplest explanation for the ab- 
sence of such relationships in the control sam- 
ple would be in terms of the disabling and 
progressive nature of MS which might evoke 
responses which would not be called forth by 
a brain injury that was more specific in its 
consequences and not progressive. However, 
this explanation would not seem to be com- 
pletely satisfactory and the possibility that at 
least some of the MS patients would not over- 
lap the control patients on significant person- 
ality characteristics cannot be ruled out. 

There is evidence in the present data of the 
need for longitudinal analysis in the study of 
MS. Data concerning preillness educational 
and occupational attainment were incomplete 
but suggested the need for further study of 
the possibility that the Abnormal Depressed 
MS cases had lower IQs and the Normal Non- 
depressed MS cases had higher IQs which 
antedated the onset of diagnosable MS symp- 
toms. The changes in patients over time after 
the development of symptoms is illustrated 
by the case of a 31-year-old college graduate 
with an IQ of 129 who received a diagnosis 
of MS shortly after the development of symp- 
toms of the disease. At that time he obtained 
an MMPI profile with a peak score on Hy at 
65 which would place him in the Normal Non- 
depressed profile category. Eight years later 
he obtained an IQ of 104 and an MMPI pro- 
file of the Abnormal Depressed type with a 
peak on D of 109. 
In general, it might be concluded that it is 
dangerous to use any single measure of psy- 
chological status, presuming that such status 
is sensitive to a condition of illness, when it 
is either established or likely that the psycho- 
logical variable is correlated with demographic 
and other nonillness variables which have not 
been controlled. 


SUMMARY 


MMPI and Wechsler-Bellevue performance 
of 25 patients with MS and 25 control pa- 
tients with traumatic brain injuries indicated: 

1. There was a reaction of severe depression 
in 32% of the MS group compared to 
4% of the control group as inferred from 
MMPI profile types. This supported Canter’s 
previous findings in part but did not appear 
to warrant the generalization that the typical 
response to MS is in the direction of depres- 
sion. 

2. MS patients with profile types indicating 
severe depression had lower IQs than average 
for the group of MS patients. 

3. Younger, more intelligent MS patients 
with shorter duration of illness than average 
for the group tended to obtain nondepressed 
profiles. 

4. The MS group differed from the control 
group in that no relationship of profile type 
with IQ, duration of illness, or age was found 
in the control group. This was considered 
most likely to be due to the differing nature 
of the diseases but alternative explanations 
could not be ruled out. 

5. The importance of a longitudinal ap- 
proach to MS was indicated. 
6. The need was pointed out for control of 
all possible nonillness variables in order to be 
able to conclude that any single psychological 
characteristic is sensitive to a condition of 
illness. 


REFERENCES 


Canter, A. H. MMPI profiles in multiple sclerosis. 
J. consult. Psychol., 1951, 15, 253-256. 

Grinker, R., Ham, G. C., & Robbins, F. P. Some 
psychodynamic factors in multiple sclerosis. Proceedings 
of the twenty-eighth Annual Meeting of 
the Association for Research in Nervous Mental 
Diseases, 1948. 

Hathaway, S. R., & Meehl, P. E. The Minnesota 
Multiphasic Personality Inventory. In, Military 
clinical psychology. (Department of the Army 
Technical Manual TM8:242, Department of the 
Air Force Manual AFM 160-145) Washington, 
D. C.: United States Government Printing Office, 
1951. 

Langworthy, D. R. A survey of the maladjustment 
problems in multiple sclerosis and the possibilities 
of psychotherapy. Proceedings of the twenty-eighth 
Annual Meeting of the Association for Research 
in Nervous Mental Diseases, 1948. 

Marshall, Simone. Personality correlates of peptic 
ulcer patients. J. consult. Psychol., 1960, 24, 218-223. 

Philippopoulos, G. S., Wittkower, E. D., & 
Cousineau, A. The etiologic significance of emotional 
factors in onset and exacerbations of multiple 
sclerosis. Psychosom. Med., 1958, 20, 458-474. 

Scheffé, H. A method for judging all contrasts in 
analysis of variance. Biometrika, 1953, 40, 87. 

Sugar, C., & Nadell, R. Mental symptoms in multiple 
sclerosis. J. nerv. ment. Dis., 1948, 98, 267-280. 

Vieg, M. J. Clinical investigation into the psychological 
aspects of multiple sclerosis. Unpublished 
doctoral dissertation, University of Minnesota, 
1947. 


(Received August 25, 1960) 


Journal of Consulting Psychology 
1961, Vol. 25, No. [illegible] 


DICHOTOMOUS EVALUATIONS IN SUICIDAL INDIVIDUALS 


CHARLES NEURINGER 


Suicide Prevention Center, Los Angeles, California 


It need not be pointed out that suicide is a 
leading cause of death in the United States 
and that it constitutes a serious mental health 
problem. Research is slow and difficult because 
of a number of complex methodological 
difficulties, among which is the problem of an 
adequate definition of the suicidal rubric. So 
many different kinds of behaviors and experiences 
come under this heading that it is difficult 
to perceive a common core of psychological 
characteristics that can be denoted as “suicidal” 
and easily discriminated from other 
pathological states. 

Edwin S. Shneidman (1960) has proposed 
a core of characteristics of thinking in 
suicidal individuals. He posited that the neurotic-suicidal 
individual is the perpetrator of 
thought distortions, the presence of which 
leads to a high probability of suicide occurring. 
One of these proposed types of thought 
distortions was the tendency to think in terms 
of absolute value dichotomies. 

Dichotomous Evaluative Thinking is 
the polarization of thought into an extreme 
double bind value system (e.g., “good 
vs. bad, right vs. wrong, beautiful vs. ugly”). 
The polarization is considered to be extreme 
when the object of thought is considered as 
“all or mostly good” or “all or mostly bad,” 
with very little modulation of the object into 
finer discriminations, commonly called “shades 
of grey.” 

Shneidman feels that rigid adherence to extreme 
Dichotomous Evaluative Thinking can 
lead to situations that are lethal (e.g., if an 
individual is dissatisfied with his life and does 
not find it wholly acceptable to him, he does 
not think of alternate ways of changing his 
life but of death as the only alternative). 
Dichotomous thinking seems to be an 
“either-or” kind of value thinking and not 
a “[illegible]” kind of thinking. Shneidman 
concluded that the extreme dichotomous thinker 
is trapped in a double bind and must always 
embrace one of the extremes. The implication 
is that if one adheres to a strict dichotomy of 
thought, then one has few or no degrees of 
freedom with which to maneuver. Alternatives 
cannot be perceived and the situation becomes 
unresolvable, thus leading to ideas of escape 
through death. 

Should Dichotomous Evaluative Thinking 
be a distinguishing characteristic of suicidal 
thinking, this would lend support to the view 
that there are common characteristics that are 
organized into a core personality, which could 
be called the “suicidal personality.” 


PROBLEM 


The general question that the following re- 
search attempted to answer was whether sui- 
cidal individuals can be differentiated from 
(a) another emotionally disturbed group and 
(b) from a normal control group, as far as 
the presence and extent of extreme Dichoto- 
mous Evaluative Thinking was concerned. 


METHOD 


The Semantic Differential method, which was de- 
veloped by Charles Osgood (1957), was selected as 
a means of studying the hypothesis that Dichotomous 
Evaluative Thinking is a characteristic of suicidal 
thought. 

Dichotomous thinking of an extreme kind should 
reflect itself in extreme evaluations of the concept 
being tested. If suicidal subjects evaluate more di- 
chotomously than the control subjects, they should 
score a concept as being “more good, more beautiful, 
more unpleasant and more sad” than the control 
group evaluations of the same concept. 

Another reflection of dichotomous thinking would 
be found in the amount of difference in values be- 
tween two concepts that are semantically opposite 
such as love and hate or God and the devil. Greater 
Dichotomous Evaluative Thinking of an extreme 
kind in the suicidal individuals should reflect itself 
in the perception of greater value differences on the 
same pair of opposing concepts when compared with
the control groups.

Consistent with Osgood’s suggestion concerning the 
use of the Semantic Differential when studying 
values, only those scales high on the evaluation fac- 
tor were utilized. In this study nine scales (good- 
bad, dirty-clean, nice-awful, unpleasant-pleasant, 
fair-unfair, worthless-valuable, happy-sad, dishonest- 
honest, and beautiful-ugly), and 18 concepts (de- 
mocracy, death, God, honor, communism, mother, 
success, love, life, murder, devil, father, myself, 
shame, failure, other people, hate, and suicide) were 
used. The concepts that were chosen reflect impor- 
tant people in one’s life, political systems, emotional 
states, behavioral acts, theological entities, etc. These 
kinds of concepts were selected on the assumption 
that they would tend to elicit strong evaluative re- 
actions in most people. It was felt that more imper- 
sonal concepts such as steel, apple, or skyscraper 
would not evoke measurable evaluative thinking. It 
is difficult to categorize steel as being good or bad, 
honest or dishonest, happy or sad, fair or unfair, etc. 
The concepts that were chosen had less veridical ob- 
jectivity and therefore should be open to greater 
value interpretation. 

Twelve of the concepts were organized into six 
semantically opposed pairs (God-devil, life-death, 
honor-shame, success-failure, love-hate, and democ- 
racy-communism) in order to test for the amount of 
perceived value difference between the members of 
the paired concepts. 


Scoring 


The scoring of a subject’s Semantic Differential for 
value extremeness was accomplished by assigning a 
score of 3 for extreme judgments like “very good” or 
“very bad,” a score of 2 for moderate judgments like 
“moderately beautiful” or “moderately ugly,” a score 
of 1 for judgments such as “mildly worthless” or 
“mildly valuable.” A score of 0 was assigned to the 
midpoint rating. The more extreme the judgment, the 
higher the score on the scale. The mean value ex- 
tremeness score per scale was recorded for the sub- 
ject. Each subject made 162 scale judgments. 
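
The extremeness scoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's procedure: it assumes each seven-step scale is coded 1 through 7 with 4 as the midpoint (the paper does not give a numeric coding), and the names `extremeness` and `mean_extremeness` are hypothetical.

```python
from statistics import mean


def extremeness(rating: int) -> int:
    """Map a rating on an assumed 1..7 scale (midpoint 4) to a score of 0..3.

    3 = "very", 2 = "moderately", 1 = "mildly", 0 = midpoint.
    """
    if not 1 <= rating <= 7:
        raise ValueError("rating must be between 1 and 7")
    return abs(rating - 4)


def mean_extremeness(ratings):
    """Mean value extremeness score per scale for one subject."""
    return mean(extremeness(r) for r in ratings)


# Hypothetical ratings of one concept on the nine evaluative scales.
example = [1, 7, 4, 2, 6, 5, 3, 4, 7]
example_score = mean_extremeness(example)
```

In the study each subject's mean would be taken over all 162 judgments (18 concepts by nine scales) rather than over a single concept as in this example.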

The scoring of a subject’s Semantic Differential for 
value differences was accomplished by comparing the 
rating differences between a pair of oppositional con- 
cepts scale by scale. The difference score per scale 
was the number of rating spaces separating the two 
judgments, including the scored spaces on the same 
scale. Complete opposition such as life being “very 
good” and death as being “very bad” received a score 
of 7. Complete identity received a score of 1. The 
greater the amount of disparity in scale ratings, the 
greater the value difference score. The mean value 
difference score per scale was recorded for the sub- 
ject. There were 54 scale difference scores available 
for each subject. The greater the value difference 
score, the greater the value disparity between the 
opposing concepts. 
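
The difference scoring can be sketched in the same way. The sketch below again assumes a 1 through 7 coding of the rating spaces, which the paper does not state, and the function names are hypothetical. Counting the spaces inclusively means that identical ratings score 1 and ratings at opposite ends score 7, as described above.

```python
def value_difference(rating_a: int, rating_b: int) -> int:
    """Number of rating spaces separating two judgments on one scale.

    Both end spaces are counted, so identity scores 1 and
    complete opposition scores 7.
    """
    for rating in (rating_a, rating_b):
        if not 1 <= rating <= 7:
            raise ValueError("ratings must be between 1 and 7")
    return abs(rating_a - rating_b) + 1


def mean_value_difference(paired_ratings):
    """Mean value difference score per scale over (a, b) rating pairs."""
    scores = [value_difference(a, b) for a, b in paired_ratings]
    return sum(scores) / len(scores)


# Hypothetical ratings of "life" and "death" on three scales.
life_death = [(1, 7), (2, 6), (4, 4)]
life_death_score = mean_value_difference(life_death)
```

With six concept pairs and nine scales, each subject would contribute 54 such difference scores.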

Congruent with the hypothesis that suicidal indi- 
viduals evaluate more dichotomously than other peo-
ple, it would be expected that the suicidal subjects in
this study would have greater value extremeness and 
difference scores than the control subjects. 


Subjects 


The subjects were gathered from five Veterans Ad- 
ministration hospitals and one large metropolitan 
general hospital. Four of the hospitals were in the 
Los Angeles area and two were in the Missouri- 
Kansas area. Four of the hospitals were general medi-
cal and surgical and two were neuropsychiatric.

Three groups of 15 subjects each were utilized in 
the present study. The first of these groups was com- 
posed of individuals who had made a serious attempt 
at killing themselves (S group). The second group 
were subjects suffering from marked psychosomatic 
difficulties (PS group) and the last group was com-
posed of normal hospitalized patients (N group).

All the subjects were native-born, Caucasian males 
between the ages of 21 and 55. They were of normal 
intelligence as defined by the Information subtest 
of the Wechsler-Bellevue Intelligence Scale, Form I 
(Wechsler, 1944). (The S, PS, and N groups earned 
mean Information subtest scores of 16.4, 17.2, and 
17.6, respectively. The corresponding standard devia-
tions were 4.1, 2.9, and 3.4. An analysis of variance
was carried out and an F ratio of 1.46 was found 
which for 2 and 42 degrees of freedom was not sig- 
nificant at the .05 level of confidence.) None of the 
subjects were psychotic and they were all in good 
enough mental and physical shape to partake in the 
research. All the subjects were hospitalized at the 
time of the Semantic Differential administrations. The 
suicidal subjects were hospitalized because of their 
suicide attempts, and the psychosomatic subjects 
were in the hospital receiving treatment for their 
physical difficulties. Hospitalized normal patients were
used as a control for the effects of hospitalization. 

None of the subjects received any kind of psy-
chiatric or psychological treatment previous to par-
taking in the research project. Great care was taken
to select subjects who had had no electroshock,
ataratic drugs, or individual or group psychotherapy.

Besides these general characteristics of the subjects,
the suicidal subjects were chosen on the basis of their
having made a bona fide suicidal attempt. The es-
tablishment of the suicide attempt was made by (a)
the verbal admission of the patient and (b) the pres-
ence of some objective evidence of self-destruction
such as a high barbiturate level in the blood, noxious
chemical substances lavaged from the stomach, or
deep surgical wounds on the body. In addition, the
suicidal subjects carried a diagnosis of neurosis as
defined by the American Psychiatric Association diag-
nostic manual (1952).

The psychosomatic subjects were categorized by
(a) no history of suicidal threat or attempt and (b)
a diagnosis of psychosomatic disability as defined in
the above cited manual.

The normal subjects were defined by (a) no evi-
dence of suicidal threat or attempt and (b) no evi-
dence of emotional disturbance. The absence of emo-
tional difficulties was approximated with critical items
from the Cornell selective indices (Mittlhemann &
Brodman, 1946). The normal patients were in the
hospital only for medical and surgical problems of a
transient nature, such as appendectomies, tonsil re-
movals, and minor fractures. Patients that suffered
from physical difficulties such as skin diseases or in-
ternal gastric disorders were not utilized because of
the possibility of unknown psychosomatic involve-
ment. Patients who had remained in the hospital
longer than 1 month were eliminated since there
might be unknown psychological factors which were
propedeutic to their staying in the hospital longer
than prescribed by the nature of their physical dis-
abilities.


RESULTS 


The results of the study suggest that there
is little difference between the suicidal and
psychosomatic subjects as far as extreme Di-
chotomous Evaluative Thinking is concerned.
However, a significant difference was found
between the two emotionally disturbed groups
and the normal hospitalized subjects.


Extremeness Scores 


The percentage of the different kinds of
value extremeness responses made by the three
groups of subjects is presented in Table 1.

For the high value extremeness category
(Number 3, indicating extreme judgments
such as “very good, very bad,” etc.), the
suicidal group earned the highest percentage
of responses (71%), while the normal sub-
jects earned the lowest percentage of responses
(52%). This relationship is reversed for the
low value extremeness category (Number 0,
indicating neutral or equal judgments such as
“equally good and bad, equally honest and
dishonest,” etc.). For these modulated rat-
ings, the normal group earned the highest per-
centage (25%), and the lowest percentage
(10%) was obtained by the suicidal subjects.
The distribution of extremeness scores for the
psychosomatic group lies closer to the suicidal
group than to the normal group. The means
and standard deviations of the extremeness
score per scale are presented in Table 2.


TABLE 1


PERCENTAGE OF RESPONSES MADE ON EACH
EXTREMENESS CATEGORY BY THE SUICIDAL, PSY-
CHOSOMATIC, AND NORMAL SUBJECTS ON THE
SEMANTIC DIFFERENTIAL TEST


              Extremeness Category
Group        3     2     1     0    Total %
S group     71    ...   ...   10     100
PS group    ...   ...   ...   ...    100
N group     52    ...   ...   25     100


TABLE 2


MEANS AND STANDARD DEVIATIONS OF THE VALUE
EXTREMENESS SCORES PER SCALE FOR THE SUICIDAL,
PSYCHOSOMATIC, AND NORMAL SUBJECTS


Measure    S Group    PS Group    N Group
Mean         2.45       2.26        1.93
SD            .37        .39         .46

It appears that all the subjects used ex- 
treme value judgments and their judgments 
were distributed in roughly the same manner. 
But the suicidal and psychosomatic subjects 
made more extreme value judgments than the 
normal subjects. It was hypothesized that the 
suicidal subjects, if they used Dichotomous
Evaluative Thinking to a greater degree than
the control subjects, would assign more ex- 
treme value ratings to the Semantic Differ- 
ential scales than the control subjects. A sim- 
ple analysis of variance of the scale extreme- 
ness scores was computed and the summary 
is presented in Table 3. An F ratio of 6.62 
was found, which was significant at the .01 
level of confidence. The source of the differ- 
ences between the means was traced by the 
Tukey method (1949) and it was found that 
both the suicidal and psychosomatic groups 
earned significantly higher value extremeness 
scores than the normal group. There was no 
statistical difference between the two emo- 
tionally disturbed groups. 


TABLE 3


ANALYSIS OF VARIANCE OF THE VALUE EXTREMENESS
SCORE PER SCALE FOR THE SUICIDAL, PSYCHOSO-
MATIC, AND NORMAL GROUPS


Source      df    Mean Square     F
Between      2       1.06        6.62*
Within      42        .16
Total       44

* Significant at the .01 level of confidence.
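
As a rough illustration of the simple (one-way) analysis of variance summarized in Table 3, the sketch below computes the between and within mean squares and their ratio. The raw scores in it are hypothetical, since the individual subjects' scores are not reported, and the function name `one_way_anova` is an assumption made for the example. The last line only checks the arithmetic of the printed table, where F is the between mean square divided by the within mean square.

```python
def one_way_anova(groups):
    """One-way analysis of variance for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-groups sum of squares: squared deviations from own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical scores for three small groups (not the study's data).
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])

# Ratio of the mean squares printed in Table 3.
f_from_table_3 = 1.06 / 0.16
```

With three groups of 15 subjects the same function would give 2 and 42 degrees of freedom, matching the table.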


TABLE 4


PERCENTAGE OF VALUE RESPONSES MADE ON EACH
DIFFERENCE CATEGORY BY THE SUICIDAL, PSYCHO-
SOMATIC, AND NORMAL SUBJECTS ON THE SEMANTIC
DIFFERENTIAL TEST


              Difference Category
Group        7    6    5    4    3    2    1   Total %
S group     50   16    6   13    4    6    5    100
PS group    39    6   14   22    6    5    8    100
N group     26   10    8   26    7    6   17    100


Difference Scores 


The percentage of scale value difference 
scores earned by the three groups of subjects 
is presented in Table 4. The left hand part of 
the table represents the more extreme value 
difference categories, while the right hand part 
of the table represents the less extreme value 
difference categories. The suicidal subjects
placed 50% of their responses in the most 
extreme value difference category, while only 
26% of the normal subjects’ responses were 
found there. The least extreme value differ- 
ence category finds the normal group con- 
tributing 17% of their responses, while the 
suicidal group contributed 5% of their re- 
sponses. The psychosomatic group distribu- 
tion lies in between the suicidal and normal 
groups and is closer to the former. The means 
and standard deviations of the difference score 
per scale are presented in Table 5. 

From the tabular material, it can be seen 
that all the subjects made a great number of 
value differentiations between the opposing 
concepts. The only difference seems to lie in 
the extent of the differences. It was hypothe- 
sized that the suicidal subjects, if they used 
Dichotomous Evaluative Thinking to a greater 


TABLE 5 
MEANS AND STANDARD DEVIATIONS OF THE VALUE 


DIFFERENCE SCORES PER SCALE FOR THE SUICIDAL, 
PSYCHOSOMATIC, AND NORMAL SUBJECTS 


Measure S Group PS Group N Group 
Mean 5.59 5.03 4.34 
SD -80 -80 1.00 
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degree than the control subjects, would per- 
ceive greater value differences between the 
paired oppositional concepts than the control 
subjects. A simple analysis of variance was 
carried out on the scale difference scores and 
a summary of the analysis is presented in 
Table 6. An F ratio of 6.77, significant at the 
.01 level of confidence, was found. The source 
of the significant difference was traced by the 
Tukey method (1949) and it was found that 
the suicidal and psychosomatic groups earned 
significantly greater value difference scores 
than the normal group. There was no sta- 
tistical difference between the two emotion- 
ally disturbed groups. 


TABLE 6


ANALYSIS OF VARIANCE OF THE VALUE DIFFERENCE
SCORE PER SCALE FOR THE SUICIDAL, PSYCHOSO-
MATIC, AND NORMAL GROUPS


Source      df    Mean Square     F
Between      2       5.15        6.77*
Within      42        .76
Total       44

* Significant at the .01 level of confidence.


DISCUSSION


The evidence gathered in this study did 
not support the contention that Dichotomous 
Evaluative Thinking was primarily or solely 
a characteristic of the thinking of the suicidal 
individuals utilized in this study. The suicidal
subjects tended to score their Semantic Dif-
ferentials in such a manner as to reflect the
presence of dichotomous evaluations. How-
ever, both control groups showed this tend-
ency, and the psychosomatic subjects showed
just as many dichotomous evaluations as the
suicidal subjects. None of the three groups
of subjects demonstrated exclusive function-
ing as far as Dichotomous Evaluative Think-
ing was concerned. The differentiation among
the three groups seemed to be the amount of 
Dichotomous Evaluative Thinking that the 
subjects used. 

It should be stressed that the findings of 
this study should not be generalized to other 
aspects of thinking, but should be restricted
to the cognitive organizations of values. It is
not known whether emotionally disturbed in-
dividuals utilize more dichotomous evaluations
than normal people when dealing with im-
personal objects such as “tin cans or soup
spoons.” Neither is it known whether Di-
chotomous Evaluative Thinking extends into
areas of cognitive organization where values
do not play a major role (e.g., syllogistic rea-
soning, mathematics, etc.).

Even though it has been suggested by Os-
good (1957) that most everything is thought
of in terms of some kind of value orientation,
it cannot be said that Dichotomous Evalua-
tive Thinking would have been found if scales
high on the Activity and Potency factors were
used in the present study. It may well be that
“steel is stronger” or that “carbonated water is
fizzier” for emotionally disturbed individuals
in comparison to normal people, but data that
would substantiate such a contention are not
available.

The concept of Dichotomous Evaluative
Thinking should be restricted to objects of
judgment which are pregnant with personal
meaning and therefore elicit strong value re-
actions in most people.

If, on the basis that self-destruction is a
drastic solution to personal difficulties, it is
assumed that the suicidal individuals were
under the greatest amount of stress, and that
the normal subjects had comparatively the
lowest level of stress, then the results of this
study might be compatible with the inter-
pretation that stress causes Dichotomous
Evaluative Thinking to take on pathologi-
cal, nonadaptive characteristics. A cog-
nitive mode of thinking that is normal un-
der [illegible] conditions becomes a pathological
[illegible] when the person is under great
[illegible] stress. It would appear that a
[illegible] of Dichotomous Evaluative Think-
ing is a common characteristic of neurosis
and emotional disturbance. It is felt that sui-
cide is not caused by Dichotomous Evaluative
Thinking but rather that its pathological and
[illegible] utilization accompanies personal
[illegible] a manifestation of [illegible]


SUMMARY


The hypothesis that suicidal individuals 
think more dichotomously than other emo- 
tionally disturbed people and normal subjects 
was tested by the use of the Semantic Dif- 
ferential test. 

Two measures of Dichotomous Evaluative 
Thinking were devised. They were (a) a 
value extremeness score (extremeness of value 
judgments on the Semantic Differential scales) 
and (b) a value difference score (the amount 
of value difference perceived between two op- 
posing concepts per scale). 

Three groups of 15 subjects each were uti- 
lized. They were a group of individuals that 
attempted suicide, psychosomatic patients, 
and normal hospitalized patients. All the sub- 
jects were Caucasian native-born males. Each 
subject was administered a Semantic Differ- 
ential form consisting of 18 concepts with 
nine scales for each concept. 

The results indicate that for the value ex- 
tremeness and difference scores, the suicidal 
and psychosomatic subjects earned higher 
scores than the normal subjects. No statisti- 
cal differences were found between the sui- 
cidal and psychosomatic groups on the two 
types of scores. 

It was concluded that Dichotomous Evalua- 
tive Thinking seems to be a common charac- 
teristic of emotionally disturbed persons and 
was not an exclusive factor in the thinking 
of suicidal individuals. 


EVALUATION OF PSYCHOTHERAPY AS AN ADJUNCT TO
INSULIN-COMA THERAPY 1


PHILIP ROOS 
Board for Texas State Hospitals and Special Schools 


There are three main viewpoints prevalent 
today concerning the value of psychotherapy 
as an adjunct to the somatic therapies: (a) 
some form of psychotherapy must be com- 
bined with somatic therapy if genuine im- 
provement of the patient is to occur (Palmer 
& Riepenhoff, 1950); (b) it is undesirable to
combine psychotherapy with somatic therapies
since somatic therapies may actually render
the patient less accessible to psychotherapy
(Frank, 1950; Rosen, 1953; Sullivan, 1940);
and (c) psychotherapy contributes little or 
nothing to the treatment of the patient over 
and above such methods as insulin-coma ther- 
apy (Kalinowski & Hoch, 1952). 

The present study sought at least a partial 
answer to the following question: to what ex- 
tent does psychotherapy contribute to the im-
provement of schizophrenic patients who are
undergoing insulin-coma treatment? Studying 
the effects of psychotherapy in conjunction 
with insulin therapy and the total treatment 
program that is associated with it furnished 
an unusually rigid test of the value of psycho- 
therapy. In this case, the comparison was not 
between a group receiving psychotherapy and 
a group receiving no specific treatment, but 
it was between a group receiving insulin ther- 
apy as a part of a general treatment program 
and a group receiving this plus psychotherapy. 


METHOD 


Essentially, the research design consisted of com- 
paring the improvement made by two groups of 


1 This paper is based upon a doctoral dissertation 
done at the University of Texas, January 1955. The
writer wishes to express his appreciation to Wayne 
Holtzman for his guidance and enthusiastic support. 
Recognition is also due the staff of the Waco Vet- 
erans Administration Hospital where this study was 
conducted. 
schizophrenics undergoing deep insulin-coma therapy.
The experimental group differed from the control
group in only one variable; namely, the addition of
individual and group psychotherapy to the total in-
sulin treatment program. The progress made by the
patients was assessed by diverse techniques judged
to be objective, quantifiable, and meaningful. All
judgments and ratings of improvement were con-
ducted in such a way as to eliminate contamination
by possible bias on the part of the investigators. The
psychotherapy used with the experimental subjects
was studied by means of recordings, therapists’ notes,
and quantified therapists’ ratings.


Subjects 


Over a 9-month period complete data were gath-
ered on 19 experimental and 18 control cases. All
cases were male veterans in the Veterans Adminis-
tration Hospital, Waco, Texas. As patients were ad-
mitted to the insulin service, they were randomly as-
signed to the experimental and control groups. No
significant differences were found between the two
groups in age, education, or chronicity of psy-
chiatric condition.2 Pretreatment ratings on scales
based on psychiatric interviews, observed ward be-
havior, and psychological tests clearly demonstrated
the presence of psychopathology in all patients prior
to therapy.3


Insulin Treatment Service 


The insulin service accommodated a maximum of
16 patients so that each group consisted of no more
than 8 patients at any given time. All the research
patients lived on the same ward and were exposed
to the same ward personnel, psychiatrists, social
worker, and psychologists, as well as to the same
general environment. They were housed in a special
wing which provided treatment room, day room,
and sleeping quarters.

Typically, the patients underwent insulin coma
[illegible] days a week. In rare cases insulin coma was
combined with electric shock treatment, or the comas
were interspersed with shock treatments.4 These
modifications in treatment were determined by the
psychiatrist who kept a very close watch on each
patient. The overall treatment time was set at 10
weeks by the psychiatrist so that all patients would
have the same amount of somatic therapy and the
same length of time on the intensive treatment
service.


2 A detailed description of these subjects is avail-
able in Roos (1955).

3 The scales and rating procedures are described in
a later section of this paper.

Occupational therapy, recreational therapy, and so-
cial service functions were available to all patients.
For the duration of the research project, however,
the social service worker limited himself to discussing
superficial reality problems with the insulin patients
in order to avoid possible contaminating effects.


Evaluative Criteria 


Three types of evaluative measures were used: (a)
scales based on a psychiatric interview and observed
behavior on the ward, (b) scales based on psycho-
logical test data, and (c) length of stay in the hos-
pital following termination of the insulin coma treat-
ment.

Multidimensional Scale for Rating Psychiatric Pa-
tients (MSRPP). The MSRPP is divided into two
sections, one referring to a psychiatric interview and
the other focusing upon ward behavior as judged by
nurses and psychiatric aides (Lorr, 1953). Prelimi-
nary studies utilizing this scale report satisfactory
reliability and validity (Lorr, 1953; Lorr, Jenkins, &
Holsopple, 1954). The scale yields an overall mor-
bidity score indicative of degree of pathological
variation from “normal” behavior.

All those using the scales were carefully instructed
in the proper rating procedure. Common rating er-
rors such as halo effect, central tendency, and posi-
tion were discussed in group meetings. The psy-
chiatrist in charge of the insulin service was re-
quested to use the MSRPP Interview scales to record
his impressions of the patient based upon an inter-
view prior to onset of insulin treatment and again
ten weeks later when treatment was terminated. The
psychiatrist was in close contact with each patient
throughout his treatment as well as for several weeks
prior to its initiation. He was unaware, however, of
which patients received psychotherapy and was care-
ful not to elicit material from them which might re-
veal this information to him, thus preventing the
possibility of unconscious bias influencing his judg-
ments.


The MSRPP Ward Observation scales were pre-
sented to the nurses, aides, occupational therapists,
and recreational workers, and they were asked to
rate each patient biweekly. These workers had close
daily contact with the patients so that these ratings
should be revealing of the patients’ behavior in so-
cial situations and on the ward. These raters were
given the impression that the research was concerned
only with intensive study of patients undergoing in-
sulin treatment. Thus, independent behavioral ratings
obtained from the ward personnel provided unbiased
criteria of therapeutic success.


4 Specific treatment data on each patient are summa-
rized in Roos (1955).

Psychological Tests. Each patient was administered 
a battery of psychological tests before the beginning 
of treatment and again upon its completion. These
tests were given by a psychologist other than the
experimenter, who was unaware of whether a given 
patient was in the control or experimental group. 
The battery included the following group-adminis- 
tered tests: (a) the Grayson Perceptualization test, 
(b) the Kent EGY intelligence test, (c) the Human 
Figure Drawing Test, (d) the Bender-Gestalt, (e) a 
sentence completion test, and (f) a modification of 
the Wechsler-Bellevue Picture Completion Test. All 
research patients were also given the Rorschach test 
before and after treatment. This test was adminis- 
tered individually with a standard inquiry following 
the technique outlined by Klopfer and Kelley (1946). 
Each protocol was then scored, the Psychogram was 
plotted, and the main ratios computed. 

Test interpretation scales intended to quantify sig- 
nificant aspects of personality which might be ex- 
pected to show changes as a result of psychotherapy 
were devised by the investigator. Two sets of rating 
scales were developed. One set, consisting of five 
steps, was designed to assess the patient’s function- 
ing in certain areas at the time of testing. The other 
set, consisting of seven steps, was aimed at estimat- 
ing degree of change taking place during a lapse of 
time in the same areas of functioning. The first set, 
referred to as Cross-Sectional Rating scales, was ap- 
plicable to a single test. The second set, referred to 
as Comparison Rating scales, was applicable to com- 
bined pre-posttests. The names of the seven scales 
follow: 5 Intellectual Efficiency, Emotional Function- 
ing, Psychopathological Characteristics, Interpersonal 
Efficiency (Nature of Social Contacts), Interpersonal 
Efficiency (Need Gratification), Self-Attitudes, and 
Psychosexual Level. 

Three clinical psychologists served as raters; all 
had considerable experience in psychodiagnostics. The
scales were discussed in detail with them. Only after 
the scales were clearly understood and considered 
unambiguous by all raters were they applied to the 
research cases. 

Cross-Sectional Ratings. All identifying data were 
removed from the test items as well as any infor-
mation pertaining to whether the given material was
obtained before or after treatment. The group tests 
were treated as one unit, and the Rorschach was 
treated as a separate unit. The raters were first pre- 
sented with a set of group tests from 10 patients, or 
a set of 10 Rorschachs, consisting of pre- and post-
tests and both control and experimental cases. They
then rated each Rorschach and each set of group 
tests (which were treated independently and without 
knowledge of which group tests belonged with which 
Rorschach) on the Cross-Sectional Rating scales. 
Following this procedure, which was repeated with
several sets of test protocols for 10 patients at a
time, each rater was handed a set of 10 Rorschachs
paired with the corresponding group tests. Again the
raters were unaware of whether an individual set of
tests had been gathered before or after treatment or
whether it belonged to an experimental or control
case.


5 The actual operational definitions for each step
on the scales can be found in Roos (1955).


Comparison Ratings. The next step in the rating
procedure consisted of giving each rater a set of 10
pairs of Rorschachs or 10 pairs of group tests, each
pair consisting of the pretreatment and posttreatment
protocols for a single patient. He was told which of
each pair was pre- and which was posttreatment and
was requested to complete the Comparison Rating 
scales. As in the previous rating sessions, each set of 
10 contained both experimental and control cases, 
selected randomly. 

Finally, after evaluating several sets of pre-, post- 
Rorschachs and group tests, each rater was handed 
a set of 10 complete pairs, each pair consisting of 
the pretreatment Rorschach and corresponding group 
tests and the posttreatment Rorschach and group 
tests of the same patient. The rater was told which 
were the pre- and which were the posttreatment tests 
and was requested to complete Comparison Rating 
scales on the basis of the entire test battery. 


Psychotherapy 


The experimental group met three times a week 
for group psychotherapy sessions conducted by the 
experimenter and a staff psychologist. Each of the 
therapists met the group alone once a week, and on 
the third day both acted as cotherapists. In addition, 
the patients were seen for individual psychotherapy 
sessions by the experimenter and staff psychologist. 
They were assigned randomly to each of the psycho- 
therapists for these individual sessions. The time and 
number of individual sessions per patient were deter- 
mined in part by the therapist’s judgment and in 
part on the basis of the patients’ requests for indi- 
vidual sessions. On the average, two weekly indi- 
vidual meetings were held, All experimental patients 
were told they could request individual sessions as 
their needs arose. 

The therapists made notes on each patient’s be- 
havior immediately following every therapy session 
(group as well as individual), and each group ses- 
sion was briefly summarized in terms of basic themes 
and tensions. Many of the sessions, both individual 
and group, were recorded and filed.6 


RESULTS AND DISCUSSION 


The experimental design of this study was 
aimed primarily at furnishing at least a par- 
tial answer to the question: can the effects of 
short-term psychotherapy used as an adjunct 
to insulin-coma treatment be measured over 
and above the effects of insulin-coma therapy 
alone? By comparing the changes in relevant
personality variables between pretreatment
and posttreatment evaluations of the experi-
mental patients with the changes found in the
control patients, it should be possible to de-
termine whether the group receiving psycho-
therapy in addition to insulin improved sig-
nificantly more than the group receiving in-
sulin alone.


6 Sample transcriptions may be found in Roos
(1955).


Analysis of Measures of Personality Used as 
Criteria for Evaluation of the Effects of Psy- 
chotherapy 


Two kinds of criteria are available for 
evaluating the effects of psychotherapy: (a) 
ratings on a number of personality traits made 
by clinical psychologists after careful study 
of individual protocols of psychological tests 
administered before and after treatment, and 
(b) changes in ward behavior and psychiatric 
interview as quantified by scores on the 
MSRPP. 

Psychological Test Ratings. The psycho- 
logical test scales were analyzed with respect 
to two main factors: (a) the degree of agree- 
ment between two independent raters on each 
scale, and (b) significance of changes indi-
cated between pre- and posttreatment testing.
When the interrater agreement for both scales
involved in a pre-post treatment or intergroup
comparison was found to be at least .50, the
separate ratings of each psychologist were
pooled and averaged. When one or both inter-
rater correlations were less than .50, the rat-
ings showing the greatest variability as indi-
cated by the largest standard deviation were
used as criterion measures of personality.
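
The selection rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the five-patient ratings, and the choice of the population standard deviation are hypothetical and not taken from the study, and "greatest variability" is read here as a choice between the two raters' sets of ratings.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of ratings."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def criterion_ratings(rater_a, rater_b, interrater_rs, threshold=0.50):
    """Apply the pooling rule to one scale.

    interrater_rs holds the interrater correlations of every scale
    involved in the comparison (an assumption about how the rule is
    applied). If all reach the threshold, the two raters are averaged;
    otherwise the more variable set of ratings is used on its own.
    """
    if all(r >= threshold for r in interrater_rs):
        return [(a + b) / 2 for a, b in zip(rater_a, rater_b)]
    return rater_a if pstdev(rater_a) >= pstdev(rater_b) else rater_b


# Hypothetical ratings for five patients on one scale.
agree_a, agree_b = [1, 2, 3, 4, 5], [2, 2, 4, 4, 6]
pooled = criterion_ratings(agree_a, agree_b, [pearson_r(agree_a, agree_b)])

# Hypothetical ratings with poor agreement: the more variable rater is kept.
poor_a, poor_b = [1, 5, 1, 5, 1], [3, 3, 4, 3, 3]
single = criterion_ratings(poor_a, poor_b, [pearson_r(poor_a, poor_b)])
```

With the first pair the raters agree closely, so their ratings are averaged; with the second pair the correlation is negative, so the ratings with the larger standard deviation are used unpooled.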

The cross-sectional ratings produced lower
and more inconsistent reliability than the
comparison ratings. In general, the degree of
agreement between the two raters was better
than chance on both cross-sectional and com-
parison ratings, although a few correlations
were near zero or negative. Comparison of the
number of significant correlations between
raters judging on the basis of group tests
alone, Rorschach alone, or a combination of
group tests and Rorschach, revealed little dif-
ference. Thus, it seems that increasing the
amount of available data on which to base
ratings did not increase the degree of agree-
ment between raters. This finding raises an
important question for further research, since
many clinicians believe that personality evalu-
ations based on a comprehensive battery of
tests are more reliable and valid than those
derived from single tests.

Comparison of the mean pre- and posttreat-
ment scores obtained from the experimental
and control groups on each of the Cross-Sec-
tional Rating scales revealed marked differ-
ences in the ratings and what they signify
from one set of tests to the next (e.g., group
tests alone, Rorschach alone, and group tests
combined with Rorschach). From a total of
nine significant (.05 level or beyond) differ-
ences between pre- and posttests, eight were
derived from ratings based on the group tests
alone and one was derived from ratings based
on the group tests and Rorschach combined.
It appears from these data, therefore, that the
group tests were quite sensitive to changes oc-
curring during treatment and that ratings de-
rived from them may be of value for inter-
group comparisons. On the other hand, the
Rorschach tests, or at least judgments de-
rived from them by the raters in this study,
did not significantly reveal these changes. One
possible interpretation of these results is that
the personality characteristics measured by
the Rorschach are "deep" enduring genotypi-
cal traits which seemingly remain unchanged
as a result of treatment. Another, more plau-
sible interpretation, however, is that the clini-
cal judgments based on the Rorschach do not
reflect the personality characteristics in ques-
tion.

Comparison ratings on all scales, regardless
of the data upon which they were based,
indicated improvement, suggesting that they
are suitable for comparing the experimental
and control groups. These results seem to con-
flict with those derived from the cross-sec-
tional ratings, which indicated that only those
ratings based on the group tests revealed im-
provement. Apparently, knowing which
data were pretreatment and which were
gathered after treatment greatly influenced
the raters, who believed, in general, that
patients were improving. Rater bias would
affect the experimental and control groups
equally, however, since the raters had no clues
as to which protocols were obtained from
the experimental group and which were
from the control group.


MSRPP Ratings 


A measure of psychopathology was derived 
from the MSRPP by combining the ratings 
based on the psychiatric interview with the 
average of the four ratings made by the ward 
personnel. Comparison of pre- and posttreat- 
ment ratings revealed a significant decrease 
in pathology as measured by the MSRPP (be- 
yond .001 level). These results, however, do 
not permit any generalizations regarding in- 
sulin-coma therapy, since the present study 
was not designed to evaluate this type of 
treatment. 


Comparison between Experimental and Con- 
trol Groups 


On the basis of the preceding analysis of 
measures used in this study to assess person- 
ality changes, the following types of data were 
found to reveal changes between pre- and 
posttreatment evaluations: (a) the cross-sec- 
tional ratings based on the group tests, (b) 
the comparison ratings based on all cate- 
gories of psychological test data, and (c) the 
MSRPP measure of pathology derived from 
psychiatric interview and observed ward be- 
havior. These measures are suitable, there- 
fore, for testing the hypothesis that the ex- 
perimental group improved more than did the 
control group. 

Psychological Test Ratings. The mean pre- 
posttreatment changes on the cross-sectional 
ratings for the experimental and control 
groups, together with the differences in these 
changes between the two groups and their 
statistical significance, are summarized in 
Table 1. Since both control and experimental 
patients showed some improvement, only the 
cross-sectional ratings for which there exist 
significant differences between the pre- and 
posttreatment testing for either the control or 
the experimental group are included in this 
analysis. 

Consideration of these data reveals that all 
differences in improvement between the ex- 
perimental and control groups favor the ex- 
perimental group; that is, the ratings on those 
scales which differentiated significantly be- 
tween pre- and posttreatment testing reveal 
more improvement in the experimental than 
in the control group. Ratings based on the
group tests for three scales (Emotional Func-
tioning, Psychopathological Characteristics,
and Nature of Social Contacts) reached an
acceptable measure of statistical significance
(0.05, 0.001, and 0.01 level, respectively).
Ratings on Need Gratification approached
significance (beyond 0.10 level).


TABLE 1

COMPARISON OF EXPERIMENTAL AND CONTROL GROUPS ON CROSS-SECTIONAL RATING SCALES

                                        Experimental   Control
                                        Group Mean     Group Mean
Rating Scale                            Change Score   Change Score   Difference(a)   t
Group tests:
  Intellectual Efficiency               [illegible]    0.69           0.4[?]          1.[?]
  Emotional Functioning                 1.00           0.25           0.75            [illegible]
  Psychopathological Characteristics    1.42           0.31           1.11            4.20***
  Interpersonal Efficiency:
    Nature of Social Contacts           1.16           0.37           0.78            2.94**
  Need Gratification                    0.84           0.25           0.59            1.[?]
  Self-Attitudes                        0.95           0.63           0.32            1.02
Group tests and Rorschach combined:
  Interpersonal Efficiency:
    Nature of Social Contacts           0.37           0.13           0.23            0.[?]

Note. Only scales where comparison of pre- and posttreatment ratings gave significant t's (beyond .05 level) are presented here.
(a) Positive values indicate differences favoring experimental group.
* Indicates significance beyond .05 level.
** Indicates significance beyond .01 level.
*** Indicates significance beyond .001 level.


These findings lend strong support to the
hypothesis that short-term psychotherapy used
as an adjunct to insulin-coma therapy leads
to measurable improvement in schizophrenics
over and above the changes resulting from in-
sulin-coma treatment alone. If the psycholo-
gists’ ratings on these three scales actually re-
flect the traits as defined, the results indicate
that short-term psychotherapy helped schizo-
phrenic patients to learn more mature ways
of handling their emotional reactions, to aban-
don primitive psychotic thinking, and to im-
prove their ability to function well in social
situations. These are the kinds of changes
which most psychotherapists would probably
expect to find if psychotherapy were at all
successful.

Although none of the differences between
the experimental and control groups on the
comparison ratings reached statistical signifi-
cance, all the ratings based on the group tests
alone favored the experimental group, while
ratings based on the Rorschach tended to fa-
vor neither group consistently.

MSRPP Ratings. Comparisons of the ex-
perimental group and the control group with
respect to improvement as reflected by changes
from the MSRPP revealed that the experi-
mental group showed a greater decrease in
pathology than the control group. The differ-
ence failed to reach statistical significance,
however, suggesting there is only a trend favor-
ing the hypothesis that the experimental group
made more progress than the control group.

Post-insulin Hospitalization. Since the
MSRPP ratings were made shortly following
termination of insulin therapy, the tempo-
rary leveling effects of the insulin may have
overshadowed group differences which would
become evident after the immediate effects of
the insulin had dissipated. A follow-up study
of these and similar cases might reveal more
striking differences between the experimental
and control groups than those found in this
study, which compared the groups immedi-
ately following treatment. Preliminary fol-
low-up data based on a survey made 2 months
following the present study lend support to
this hypothesis. At that time only three of
the experimental patients had not been either
discharged from the hospital or released on
trial visit as compared with eight of the con-
trol patients. The average number of days
an experimental patient remained hospitalized
following the termination of insulin treat-
ment was 38.4 as compared with 72.2 days
for the control patient. Since the distribution
of numbers of days was markedly skewed, a
nonparametric method, the Mann-Whitney U
test, was used to determine the significance
of the difference between the two groups of
patients. A U of 1.94 (p less than 0.05) was
obtained, from which it can be concluded that
patients receiving psychotherapy are more
likely to leave the hospital earlier than are
those who receive insulin alone. This finding
is particularly important in view of the
fact that the psychiatrist responsible for
discharging the patients was unaware of
whether any given patient had received psy-
chotherapy.
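
The nonparametric comparison described above can be illustrated with a short sketch of the Mann-Whitney U test. Everything specific here is an assumption made for illustration: the lengths of stay are invented, not the patients' data, the function names are hypothetical, and the normal approximation carries no tie correction. The sketch returns both U and its normal-approximation z, since the source does not state which form its reported statistic takes.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, z, two-tailed p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no correction for ties
    z = (u - mu) / sigma
    p_two_tailed = erfc(abs(z) / sqrt(2))
    return u, z, p_two_tailed


# Hypothetical, skewed lengths of stay in days (not the study's data).
experimental_days = [10, 14, 21, 30, 35, 120]
control_days = [25, 40, 60, 75, 90, 200]
u, z, p = mann_whitney_u(experimental_days, control_days)
```

Because the test works on ranks, the single very long stay in each hypothetical group has no more influence than any other observation, which is the reason for preferring it to a comparison of means when the distribution is markedly skewed.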


SUMMARY 


The present study was an attempt to test
the hypothesis that short-term psychotherapy
used as an adjunct to insulin-coma treatment
results in measurable improvement in schizo-
phrenics over and above that derived from in-
sulin treatment alone. A control group of 1[?]
schizophrenics undergoing insulin-coma ther-
apy was compared with an experimental
group of 19 similar patients who received a
combination of group and individual psycho-
therapy in addition to the insulin treatment.
Evaluative criteria included: (a) pre- and
posttreatment ratings by a psychiatrist on
the basis of an intensive interview using the
MSRPP, (b) pre- and posttreatment ratings
by ward personnel using the Ward Observation
scale of the MSRPP, and (c) pre- and posttreat-
ment ratings made on the basis of individu-
ally administered Rorschachs and a battery
of group-administered psychological tests. All
ratings were made in such a way as to be free
of rater bias.

Comparison of the improvement of the experi-
mental and control groups led to the following
major findings: (a) the
psychiatric and behavioral ratings revealed
differences between the improvement made
by the two groups which consistently favored
the experimental group, but these trends did
not reach statistical significance; (b) the psy-
chological test ratings, which had differenti-
ated between pre- and posttreatment testing,
revealed significantly greater improvement in
the experimental group than in the control
group on three important scales; (c) ratings
based on comparisons between pre- and post-
treatment protocols failed to differentiate be-
tween the mean improvements made by the
two groups; and (d) comparison of the two
groups on length of time patients remained in
the hospital following termination of insulin
treatment revealed that the experimental pa-
tients, on the average, left the hospital more
than a month sooner than the control pa-
tients, and this difference proved to be sta-
tistically significant. These findings were in-
terpreted as verifying the major hypothesis.


A NOTE ON THE SIGNIFICANCE OF DISCREPANCIES
BETWEEN GOODENOUGH AND BINET IQ SCORES


ELISE ELKINS LESSING 


Illinois Institute for Juvenile Research 


Several investigators have tried to ascertain 
the significance of discrepancies between in- 
tellectual ratings derived from the Good- 
enough Draw-A-Man Test on the one hand 
and standard intelligence tests on the other. 
It has been suggested that a negative devia- 
tion of the Goodenough IQ from other rat- 
ings might be an indication of either brain 
damage or emotional maladjustment (Good- 
enough, 1950). Hinrichs (1935) and Brill 
(1937) found support for this hypothesis in 
their comparisons of Binet and Goodenough 
IQ scores in populations of delinquents and 
defectives, respectively. Hanvik (1953) com-
[...]

[...] a larger research project, the present
author obtained evidence that
the discrepancy between Goodenough IQs and
IQs on a standard test of intelligence can be
as dramatically large in a nonclinic popula-
tion as in the disturbed groups studied by
Hinrichs and Hanvik. The author had access
to male figure drawings made by 21 boys and
2 girls who were described as well-adjusted
by their teachers in the Chicago public
schools. These children were part of a larger
sample used in an Institute study of normal
8 and 9 year old children. One experienced
examiner administered Form L of the Re-
vised Stanford-Binet to all the children and
scored the protocols. The same examiner ad-
ministered the drawing test. The word “per-
son” was substituted for “man” in the Good-
enough instructions; however, the encourage-
ment to do the best job possible was retained.
The drawings were scored by the author and
a trained student worker. A reliability coeffi-
cient of .95 was obtained. The student’s scor-
ing was used as the basis for all subsequent
computations.

The mean Binet IQ of the 23 nonclinic chil-
dren was found to be 119.96 with an SD of
14.84. The mean Goodenough IQ of the group
was found to be 92.17 with an SD of 17.26.
The discrepancy of 27.79 IQ points between
the mean Binet and the mean Goodenough
IQ of this nonclinic group is actually larger
than the discrepancy of 13.72 points which
Hanvik found between the WISC and Good-
enough IQs of his clinic sample. In the non-
clinic group being described, the correlation
between the Binet and Goodenough IQ scores
was .51, which is significantly different from
[...] level. [...] is somewhat less grati-
[...] discussion would indi-
[...] pattern of test score [...]
in populations presumably [...]
by neither of these pathological
conditions.
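
The arithmetic behind the figures above can be checked with a brief sketch. It assumes the usual t test of a Pearson correlation against zero with N - 2 degrees of freedom; the source's own significance level is not legible, so nothing here restates the author's test, and the function name is hypothetical.

```python
from math import sqrt


def t_for_correlation(r, n):
    """t statistic for testing a Pearson r against zero, df = n - 2."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Values reported in the text: r = .51 between Binet and Goodenough IQs, N = 23.
t_value = t_for_correlation(0.51, 23)

# Difference between the reported mean Binet and mean Goodenough IQs.
mean_discrepancy = round(119.96 - 92.17, 2)

# Standard two-tailed critical values of t for 21 degrees of freedom.
T_CRIT_05, T_CRIT_01 = 2.080, 2.831
significant_at_05 = abs(t_value) > T_CRIT_05
significant_at_01 = abs(t_value) > T_CRIT_01
```

Under these assumptions t is about 2.72 on 21 degrees of freedom, which exceeds the two-tailed .05 critical value but not the .01 value, and the difference between the two means reproduces the 27.79 points quoted.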
BRIEF REPORTS


CASE HISTORY DATA AND PSYCHIATRIC DIAGNOSIS 1


EDWARD ZIGLER AND [...]

Yale University    Worcester State Hospital


This study investigated how individuals in each
of the major groups of functional disorders dif-
fered from the population at large and from one
another on the variables of age, intelligence, edu-
cation, occupation, employment history, and mari-
tal status. The study was based on an examina-
tion of the case history data of 793 patients
admitted to Worcester State Hospital during a
12-year period (1945-1957) and referred to the
hospital Psychology Department for appraisal.
The diagnosis ascribed to each patient was that
psychiatric classification agreed upon at a diag-
nostic staff conference. The patients were cate-
gorized into four major diagnostic groups: manic-
depressive (37 men, 38 women); schizophrenic
(165 men, 122 women); psychoneurotic (81 men,
71 women); and character disorder (197 men,
82 women). The 1950 census data for the state
of Massachusetts were employed to compare the
total hospital population and the specific diag-
nostic groups to the population at large on the
variables investigated.

An effort was made to determine whether any
systematic differences on the biographical vari-
ables existed between the population employed in
the study and a sample of 111 patients who had
not been referred for psychological appraisal.
This latter sample was matched with the present
population on the variables of sex, year admitted
to the hospital, and diagnosis received. This com-
parison revealed that there were no differences
between the two groups on the variables of oc-
cupation, education, or employment history. On
the age variable, it was found that the popula-
tion used in this study is younger than the gen-
eral hospital sample. In the marital status vari-
able, it was found that patients who are single
are equally represented in both samples but that
married patients are under-represented while in-
dividuals who fall into the “other” category are
over-represented in the population used in this
study. No comparisons could be made on the in-
telligence variable. These differences should be
kept in mind in generalizing the findings of the
present study to a random population of patients
suffering from functional disorders.

The most striking finding of this study is that
the hospital population and the diagnostic groups
that comprise it are not representative of the
population at large. In general, hospitalized indi-
viduals are drawn from that portion of the popu-
lation which has the most difficulty in meeting
social expectancies.

The results also indicate that the individual
diagnostic groups differ from the population at
large and from one another in [...] ways. [...]


HOMOSEXUAL PREJUDICE AND PERCEPTUAL DEFENSE 1


LOUIS BREGER AND SHEPHARD LIVERANT
Ohio State University


In accordance with Adorno, Frenkel-Brunswik,
Levinson, and Sanford’s (1950) conceptualization
of prejudice (i.e., prejudice serves as a defense
against one’s own unacceptable impulses) we pre-
dicted that individuals scoring high on a measure
of homosexual prejudice would show greater in-
dices of threat to homosexual words presented in
a perceptual defense situation.

On the basis of their scores on a specially con-
structed Likert-type scale of manifest attitudes
toward homosexuality (H scale) male subjects
were divided into a high prejudice group (N
= 21), a median group (N = 25) and a low
group (N = 22). Following individual administra-
tion of the H scale each subject was placed in a
perceptual defense situation utilizing the succes-
sive carbon method of presentation. The final
measure of threat was obtained by comparing
the means of each subject on four homosexual,
[...] sexual, and eight neutral words.

Perceptual defense of the avoidance type was
demonstrated on the homosexual and sexual words
for the total group (t = 5.65, p < .001 between
neutral and homosexual words; t = 7.50, p < .001
between neutral and sexual words). However, a
significant difference was not found between the
homosexual and sexual words.

The differential reaction times to the three
groups of words by the high, middle, and low H
scale scorers indicated that group differences in
defensiveness to the homosexual words of either
the avoidance or vigilance type were not mani-
fested.

These results consistently fail to support the 
major hypothesis concerning prejudice and de- 
fensiveness. However, assuming the adequacy of 
our measures, an alternative explanation presents
itself. The adequacy of the perceptual defense
test as an index of threat tends to be substan- 
tiated by the significantly longer reaction times 
on the taboo words. The high test-retest reliabil- 
ity coefficient (Pearson r = .91, N = 43), the re- 
sults of an item analysis, and the successful con- 
trol of an acquiescence set all indicate that the 
H scale is measuring some variable in a meaning- 
ful manner and a consideration of the undis- 
guised nature of the items makes it appear likely 
that H scale scores do reflect manifest attitudes 
toward homosexuality. 

The alternative explanation, namely that the 
attitudes reflected in the H scale are for most 
subjects learned stereotypes, appears to account 
for the obtained results. The negative results with 
the perceptual defense measure are not unex- 
pected, since the notion of learned stereotypes 
involves no assumptions regarding homosexual 
prejudice as a defense against repressed impulses. 
Further corroboration of the stereotype hypothe- 
sis is provided by the finding that fraternity mem- 
bers and applied majors score significantly higher 
on the H scale than nonfraternity men and liberal 
arts majors, since the former groups would be
more likely to hold stereotyped attitudes of all
kinds.


IMPROVING THE FACTORIAL PURITY OF GUILFORD’S
RESTRAINT AND THOUGHTFULNESS SCALES 1


A. W. BENDIG 
University of Pittsburgh 


Previous factor analytic studies of the 10 scales 
of the Guilford-Zimmerman Temperament Sur- 
vey have shown that the Restraint (R) and 
Thoughtfulness (T) scales have an intercorrela- 
tion of approximately .40 and that these scales 
apparently define one of the four second-order 
factors in the GZTS: the factor of extraversion- 
introversion (EI). In order to develop a short 
and factorily saturated EI scale for personality 
research the 60 R and T scale items were sub- 
jected to four overlapping factor analyses to de-
termine the factor loadings of each item. Within
each analysis the items were intercorrelated using
the phi coefficient, two centroid factors were ex-
tracted using the complete centroid method, and
the loadings were analytically rotated to oblique
simple structure by the oblimax criterion.2 Three
groups of college subjects were involved in the
analyses: Group I consisted of 300 male fresh-
men and Group II of 130 male freshmen who
completed the full length GZTS inve[...]
Group III was composed [...]
sexes enrolled in introd[...]

The R and T scale scores of the Group I sub-
jects were summed and each item correlated with
this EI factor score. The 30 items correlating
most highly were administered to Group III and
these items factor analyzed. The 60 R and T
items were divided into two subsets of 30 items
(15 R and 15 T) and two factor analyses per-
formed. The correlations between the R and T
scores in both groups were .42, but the average
(median) correlation between the two oblique
factors was only .17. The 40 items with the most
consistent and largest factor loadings were se-
lected for analysis using the item responses of
Group II. The correlation between the R and T
factors was .21 and only one of the items had
loadings inconsistent with the preceding analy-
ses. All four analyses showed certain items to be
incorrectly keyed as to factor affiliation.

The responses of the Group III subjects were
scored for two pairs of R and T scores: the
regular 30-item scales using the Guilford-Zim-
merman keying and new 20-item scales using the
item keying suggested by the first three factor
analyses. The reliabilities of all four scores aver-
aged about .71 (.69 to .73), but the correlation
between the 30-item scales of .35 dropped to a
nonsignificant .16 between the new 20-item scales.

It was concluded that the moderate correlation
between th[...] the factorily im[...]
there is only a small correlation between
the R and [...] It was also suggested [...]
second-order factor may be solely a p[...]
contaminated scale [...]
given in the GZTS scale [...]
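
The phi coefficient used above to intercorrelate the items is the Pearson correlation applied to two dichotomous variables. A minimal sketch follows; the eight response pairs and the function name are hypothetical and are not taken from the GZTS data, and responses are assumed to be coded 1 for a keyed answer and 0 otherwise.

```python
from math import sqrt


def phi_coefficient(item_x, item_y):
    """Phi coefficient for two items scored 1 (keyed) or 0 (not keyed)."""
    pairs = list(zip(item_x, item_y))
    a = sum(1 for x, y in pairs if x == 1 and y == 1)  # both keyed
    b = sum(1 for x, y in pairs if x == 1 and y == 0)
    c = sum(1 for x, y in pairs if x == 0 and y == 1)
    d = sum(1 for x, y in pairs if x == 0 and y == 0)  # neither keyed
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denominator


# Hypothetical responses of eight subjects to two items.
item_1 = [1, 1, 1, 0, 0, 0, 1, 0]
item_2 = [1, 1, 0, 0, 0, 1, 1, 0]
phi = phi_coefficient(item_1, item_2)
```

In this invented example six of the eight subjects answer the two items alike, giving a phi of .50. A matrix of such coefficients over all item pairs is the kind of input a centroid extraction would start from.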
FOOD-RELATED RESPONSES TO AMBIGUOUS
STIMULI AS A FUNCTION OF HUNGER
AND EGO STRENGTH 1


SEYMOUR EPSTEIN 


University of Massachusetts 


In a previous study on hunger and thematic
apperception (Epstein & Smith, 1956) a theo-
retical model was proposed which could re-
solve the discrepancies in a wide variety of
studies on the influence of hunger upon food-
related responses. According to this model
drive can produce an increment, a decrement,
or no change in number of drive-related re-
sponses. The model integrates Miller’s (1951)
model of displacement and conflict with the
psychoanalytic model of thinking (Rapaport,
[...]). Briefly, it is assumed that there are
two fundamental processes associated with
every drive state: an autistic drive-oriented
expressive tendency, and a reality-oriented
inhibitory tendency. It is further assumed
that the gradient of expression as a function
of increasing stimulus-relevance is less steep
than the gradient of inhibition. It follows that
for stimuli of relatively low relevance, in-
creases in drive should result in an increase
in number or intensity of drive-relevant re-
sponses, while for stimuli of high relevance
a decrease should occur. In this respect,
studies on both the sex drive (Leiman &
Epstein, 1961) and the hunger drive (Epstein
& Smith, 1956) have indicated the importance
of stimulus [...] to the [...].
It is further assumed that response-
produced cues function in a similar manner to
stimulus-produced cues, and that it is as im-
portant to consider a dimension of response-
relevance as of stimulus-relevance. In this
connection, drive-relevant latent responses,
i.e., thoughts and images, are presumed to
produce cues which favor inhibition. Finally,
it is assumed that there are individual dif-
ferences in tendency to inhibit drive repre-
sentatives which can be related to the con-
cept of ego strength.

1 This paper was presented, in part, at the Eastern
Psychological Association, New York, April 1960.
It is part of a project on the measurement of
drive and conflict which is being supported by
Grant M-1293 from the National Institute of Men-
tal Health, United States Public Health Service. Ap-
preciation is expressed for the assistance of Jane
[...], Alan Leiman, and Morton Berger.
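
The reasoning about the two gradients can be made concrete with a toy calculation. This is not the author's formulation: the linear form, the multiplicative effect of drive, the parameter values, and the function name are all assumptions chosen only to show how a shallower expression gradient and a steeper inhibition gradient yield opposite effects of drive at low and high relevance.

```python
def net_tendency(relevance, drive, base=1.0, expression_slope=1.0,
                 inhibition_slope=3.0):
    """Toy net response tendency: expression minus inhibition.

    relevance runs from 0 (irrelevant cue) to 1 (highly relevant cue).
    Both tendencies are assumed to scale with drive, and the inhibition
    gradient is assumed to be steeper than the expression gradient.
    """
    expression = drive * (base + expression_slope * relevance)
    inhibition = drive * (inhibition_slope * relevance)
    return expression - inhibition


# Effect of doubling drive at a low-relevance and a high-relevance cue.
low_gain = net_tendency(0.2, drive=2.0) - net_tendency(0.2, drive=1.0)
high_gain = net_tendency(0.9, drive=2.0) - net_tendency(0.9, drive=1.0)
```

With these assumed values the two gradients cross at a relevance of 0.5. Below that point a rise in drive increases the net tendency to give drive-related responses; above it the same rise produces a decrease, which is the pattern the model is said to predict.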

The purpose of the present study was to 
investigate different categories of responses, 
as derived from the above theoretical ap- 
proach, and to determine whether a measure 
of ego strength could be related to the influ- 
ence of drive upon drive-related responses. 
The Rorschach test was investigated, for, de- 
spite the limitation of a small yield of food 
responses, it offers several scores of interest 
in terms of the model described, allows these 
responses to be obtained in a situation where 
stimulus characteristics play a minimal role, 
and affords a possible measure of ego strength. 
The following hypotheses were tested: 

1. With increasing hunger there is an in- 
crease in food-related responses up to a point, 
followed by a decrease. This hypothesis fol- 
lows from the assumption that strong cues,
whether stimulus-, response-, or drive-pro-
duced, favor the inhibitory process. It is con-
sistent with findings in other studies (Levine, 
Chein, & Murphy, 1942; Wispé, 1954). 

2. Food-related responses of low drive-rele- 
vance are more strongly associated with hun- 
ger than are food-related responses of high 
drive-relevance. This hypothesis follows from 
the assumption that responses that produce 
strong cues are more readily inhibited than 
responses that produce weak cues. 

3. Accurate and popular food responses are
more strongly associated with hunger than
are inaccurate and unusual food responses.
This hypothesis follows from the assumption
that the reality-oriented inhibitory process is
increasingly dominant at higher drive levels, at
least up to the point of intense drive at
which a breakdown of controls occurs. This
hypothesis is consistent with findings in a previous
study (Levine et al., 1942).

4. Food-related activity-responses are more
strongly associated with hunger than food-
related object-responses. This hypothesis is
based upon the assumption that drive has
activating as well as directing properties, and
that responses which reflect both are the best
drive representatives.

5. People of low ego strength demonstrate
a stronger relationship between hunger and
food-related responses than people of high
ego strength. This hypothesis is based upon
the assumption that one of the major aspects
of ego strength is the inhibition of drive rep-
resentatives. It is consistent with reports of
marked individual differences in inhibiting
thoughts about food (Sanford, 1937).


METHOD 


Four levels of hunger were investigated by testing
60 subjects shortly after the noon meal, 60 shortly
before the evening meal, 30 before the evening meal
after they had abstained from lunch, and 30 before
the evening meal after they had abstained from
[...]

[...] who missed one meal were paid $3.00; those
who missed [...] college students, [...] male.
Subjects [...]

Indicate how hungry you feel at the present mo-
ment, by placing a check mark to the left of the
appropriate statement:

___ (a) Not hungry at all (the thought of eating
has absolutely no appeal to me at the [...])
[...] could enjoy something good)
___ (d) Hungry (the thought of food is appeal-
ing at the moment, and even something
ordinary would be welcome)
___ (e) Very hungry (can’t wait to eat some-
thing; almost anything would taste good)

Finally, groups were equated on total number of
Rorschach responses (R). The final group consisted
of 41 subjects who had not eaten for 0-1 hours with
ratings of a to c on the subjective scale, 40
who had not eaten for 4-5 hours with ratings of
[...] and d, 22 who had not eaten for 8 hours with
ratings of d and e, and 21 who had not eaten for 23
hours with ratings of d and e.

Responses were scored in the following categories:

1. Food Imagery: Includes all other categories and
consists of all responses with food-association
2. Strong Food Association: Names of prepared
foods, e.g., “fried egg,” and people eating or prepar-
ing food, e.g., “two people cooking”
3. Weak Food Association: Names of unprepared
foods, e.g., “apple”; animals eating or seeking food,
e.g., “a squirrel eating a nut”; food-related [...]
or implements, e.g., “pot,” “potato sack”; people in
activities of questionable food relevance, e.g.,
“people lifting a bowl”
4. Food-Related Object: Names of prepared and
unprepared foods and of implements, e.g., “piece of
ham,” “pot”
5. Food-Related Activity: People or animals seek-
ing, preparing, or eating food, e.g., “two people
cooking”
6. Instrumental Responses: People or animals
seeking or preparing food; names of foods not in
edible form, e.g., “raw egg”; utensils
7. Goal Responses: Food in edible form; people
or animals eating
8. Accurate Food: Food, prepared or unprepared,
which accurately corresponds to the contours of the
blot
9. Inaccurate Food: Food, prepared or unpre-
pared, which does not accurately fit the contours of
the blot
10. Popular Food: Prepared or unprepared food
produced at least six times to the same blot area in
the total sample
11. Original Food: Prepared or unprepared food
produced no more than once to a particular
area
Finally, “ego strength” was measured by the form-level score of Klopfer (Klopfer, Ainsworth, Klopfer, & Holt, 1954), with the exception that total rather than average form-level was used, as essentially what was wanted was a score of goodness of response, and it was assumed that, holding quality constant, producing more responses indicated more ability than producing fewer responses. Although total [form level] was directly related to R and might be expected to be directly related to number of food responses, the relationship was actually inverse and did not approach significance. Thus there was no basis for concern over the two measures being confounded. In defense of the measure of ego strength, it [includes] perceptual accuracy, integrative ability, and self-imposed motivation, all of which are characteristic of ego strength.

Before scoring, the data were coded [to prevent] scoring bias. In addition, the data on food responses and on ego strength were separately presented and independently scored to [prevent] confounding of the measures derived from them.




RESULTS 


Comparisons were made of the number of subjects in each group who fell above and below the cutting point for the pooled groups. In each category the cutting point was found to be between zero and one response. Two-tailed chi square tests were used to evaluate significance.
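The above/below comparison can be illustrated with a short sketch. It computes a Pearson chi square statistic for a contingency table of groups by "zero responses" versus "one or more responses." The counts in the example are hypothetical and are not taken from the study; the function name `chi_square` and the omission of a continuity correction are assumptions made only for illustration.

```python
def chi_square(table):
    """Pearson chi square for an r x c table of counts.

    Returns (statistic, degrees_of_freedom). No continuity correction
    is applied; that choice is an assumption of this sketch.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical counts: rows are two deprivation groups,
# columns are "zero responses" and "one or more responses".
stat, df = chi_square([[30, 10], [20, 20]])
CRITICAL_05_DF1 = 3.841  # tabled two-tailed critical value, df = 1, .05 level
significant = stat > CRITICAL_05_DF1
```

With a cutting point between zero and one response, each subject simply falls into one of the two columns, so the table can be built by counting.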
In Figure 1 it can be seen that self-rated hunger increases up to 8 hours of deprivation and then levels off. The results are essentially the same whether the entire pool of subjects is used or when the sample is restricted to subjects screened on subjective hunger and matched on R. The similarity of the subjective hunger ratings of the 8- and 23-hour groups suggests that hunger may fail to increase between 8 and 23 hours of deprivation and raises the question of whether the two groups should be treated separately or combined to provide a sample as large as for the [1-hour group]. Accordingly, where indicated, the data were analyzed both ways.
Turning to Hypothesis 1, which stated that there is an increase in food imagery up to a point and then a decrease, Figure 2 indicates that this, in fact, did occur. The relationship between total food imagery and time without food, however, is not statistically significant (p = .15). If weak food associations are substituted for total food imagery, a falling off of responses again is indicated (Figure 2), but the relationship now becomes significant (.05 level). Apparently strong food associations reduce the discriminability of the total food imagery score. When the 8- and 23-hour groups were combined, both total food imagery (.05 level) and weak food imagery (.01 level) increase directly and significantly with increasing hunger. It may be concluded that as deprivation increases there is an increase in weak and overall food associations, up to a point, followed by a levelling off, or possibly decrease, somewhere between 8 and 23 hours of deprivation.

[Figure 1. Curves for the total group (N = 180) and selected Ss (N = 124); horizontal axis: hours without food.]
Fig. 1. Self-rated hunger as a function of time without food.

[Figure 2. Curves for total, weak, and strong food imagery; vertical axis: % Ss giving response; horizontal axis: hours without food.]
Fig. 2. Total, weak, and strong food associations as a function of time without food.

The finding of a significant relationship be- 
tween weak, but not strong, food associations 
and time without food is consistent with Hy- 
pothesis 2, which stated that responses of 
low drive-relevance are more strongly asso- 
ciated with hunger than responses of high 
drive-relevance. Despite this, the results do 
not entirely support the model, as according 
to the model strong food associations should 
fall off more rapidly than weak food associa- 
tions, whereas Figure 2 indicates that the 
reverse tended to occur. 

The only additional score which signifi- 
cantly (.05 level) discriminated the hunger 
groups was the food activity score, which, 
in line with hypothesis, varied directly with 
hunger. 

In order to investigate the effects of ego strength, form-level scores were summed for all nonfood responses, and a cutting point selected to divide the group as nearly in half as possible. This resulted in 67 subjects in the low form-level group and 57 in the high form-level group. In Figure 3 it can be seen that for the low ego strength group there is an increase in total food imagery from 1 to 23 hours, whereas for the high ego strength group there is an increase followed by a decrease.

Fig. 3. The influence of ego strength upon the relationship of food associations to time without food.
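The cutting-point procedure can be sketched as follows. The function picks the score value that splits a sample as evenly as possible when ties must stay together, which is one plausible reason a split such as 67 versus 57 is not exactly even. The name `best_cutting_point`, the rule that scores at or below the cut go to the low group, and the example scores are all assumptions for illustration, not details given in the study.

```python
def best_cutting_point(scores):
    """Return (cut, n_low, n_high) for the most balanced split.

    Subjects with score <= cut form the low group (an assumed convention).
    Tied scores always land in the same group, so a perfectly even
    split is not always possible.
    """
    distinct = sorted(set(scores))
    best = None
    # Cutting at the maximum would leave the high group empty, so skip it.
    for cut in distinct[:-1]:
        n_low = sum(1 for s in scores if s <= cut)
        n_high = len(scores) - n_low
        imbalance = abs(n_low - n_high)
        if best is None or imbalance < best[0]:
            best = (imbalance, cut, n_low, n_high)
    if best is None:
        raise ValueError("need at least two distinct scores")
    return best[1], best[2], best[3]


# Hypothetical summed form-level scores for seven subjects.
cut, n_low, n_high = best_cutting_point([3, 5, 5, 5, 7, 8, 9])
```

In this example the three tied scores of 5 force a 4 versus 3 division.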

A comparison of the high and low ego strength groups on all levels [of deprivation] simultaneously fails to reveal [significant] differences. Significant [...], however, if the [...] and 4.87, df = 1). The self-[ratings ...] that the high ego strength group either inhibits awareness of hunger or [...] state of reduced hunger [...] low ego strength group.


Discussion 


It was found that with increasing time without food there was an increase in food-related responses followed by a decrease. However, the relationship was significant only when strong food associations were eliminated.


Seymour Epstein 


Taken in conjunction with findings in other studies (Clarke & Epstein, 1957; Lazarus, Yousem, & Arenberg, 1953; Levine et al., 1942; McClelland & Atkinson, 1948; Sanford, 1937; Wispé, 1954; Wispé & Drambarean, 1953), it would seem safe to conclude that with increasing time without food there is an increase followed by a leveling off or decrease in at least certain types of food response. This leveling off or decrease has sometimes been interpreted to indicate the operation of response-inhibition at higher drive levels. There would be a serious difficulty with such an interpretation in the present study as the 23-hour deprivation group rated themselves as no more hungry than the 8-hour group. Moreover, subjective estimates of hunger at 23 hours were more likely overestimates than underestimates, as they were probably influenced by awareness of length of deprivation. Thus, the decrease in food responses at 23 hours may better be interpreted as reflecting a leveling or falling off in hunger rather than an inhibition of drive-related responses, i.e., as indicating drive-inhibition rather than response-inhibition.

Some evidence possibly supporting response-inhibition was presented by the finding that weak food associations were significantly associated with time without food, while strong food associations, which are more susceptible to inhibition, were not. However, according to the model the strong associations should have fallen off more rapidly with increasing hunger than the weak food associations, whereas the reverse tended to occur. Thus, if the basic model is to be preserved, it will be necessary to give consideration to the complicating effects of response-inhibition.


The findings on ego strength offer interesting evidence for individual differences in inhibition. For the groups of low ego strength, a direct relationship was found between food-related responses and deprivation through 23 hours of deprivation. For the groups of high ego strength, a direct relation was found through 8 hours of deprivation, but the 23-hour group produced significantly fewer food-related responses than the 8-hour group. The low and high ego strength groups both produced negatively accelerated curves of self-rated hunger as a function of deprivation. However, the 8-hour and 23-hour groups of high ego strength rated themselves as significantly less hungry than the corresponding groups of low ego strength. The combined evidence suggests, in accordance with hypothesis, that high ego strength subjects are more apt to inhibit than low ego strength subjects. If the self-ratings are accepted as a true report of hunger, the extremely low number of food-related responses produced by the high ego strength group at 23 hours of deprivation supports the occurrence of response-inhibition, i.e., holding subjective hunger constant, high ego strength subjects are less apt to give drive-related responses than low ego strength subjects.

It is very likely that inhibition of food-related imagery and responses serves to reduce drive, so that the two types of inhibition are not unrelated. That response-inhibition need not be a conscious process was indicated by the questionnaire at the end of the study, in response to which almost all subjects denied suppressing food-responses.
Apart from the model proposed, the finding that weak food associations varied significantly with hunger while strong ones did not is of considerable interest. Coupled with the [negative] findings on a difference between instrumental and goal responses, it suggests that a dimension of response-relevance is more fundamental than an instrumental-goal distinction. The reports in other studies (Atkinson & McClelland, 1948; McClelland & Atkinson, 1948; Wispé, 1954; Wispé & Drambarean, 1953) that instrumental responses provide better indices of drive than goal responses can be explained by considering that instrumental responses are generally of lower drive-relevance than goal responses. Moreover, grouping responses of food in inedible form (e.g., “wheat”) with eating implements (e.g., “table”) because they are presumably both “instrumental” to eating would seem more forced than classifying them [as of] relatively low food-relevance.

In accordance with hypothesis, it was found that food-related activity-responses were directly and significantly associated with hunger while food-related object-responses were not. The hypothesis was based upon the consideration that drive has both directing and activating aspects, and that associations that reflect both are better drive representatives than associations that reflect only the directive aspect.

No support was found for the hypothesis 
that popular and accurate food-related re- 
sponses increase more regularly with hunger 
than unusual and inaccurate food-related re- 
sponses. The hypothesis was based on the as- 
sumption that a reality-oriented inhibitory 
process is dominant at higher drive states, at 
least up to a point of breakdown. Levine et al. (1942) concluded that as drive increases, the organism becomes increasingly realistic in responding to drive-relevant stimuli. McClelland (1951) takes much the same view in hypothesizing that with increasing deprivation a “reality stage” follows a “wish fulfillment stage” which only under intense deprivation is superseded by a “defense stage” where wish fulfillment again becomes dominant. All that can be said at this point is that
the data are inconclusive, and that the hy- 
pothesis of increasing accuracy of drive-re- 
lated responses with increasing drive up to 
some limit, although reasonable, has yet to 
be experimentally confirmed. The evidence 
provided by Levine et al. (1942) is particu- 
larly open to question, as it is based on re- 
peated testing of five subjects without con- 
trol for practice effects, and the explanation 
was a posteriori. 

A serious limitation in the present study 
was the number and quality of food-related 
responses elicited by the Rorschach test. 
When this is considered together with the 
number of comparisons made, and taken in 
conjunction with evidence that set-effects in 
laboratory studies are apt to be more impor- 
tant than drive-effects (Clarke & Epstein, 
1957; Postman & Crutchfield, 1952; Taylor, 
1956), the need for replication under varied 
conditions is clearly indicated. That factors 
other than hunger were complicating the 
food-related responses was indicated by the 
bizarre nature of some of the responses, e.g., “Dante’s inferno, the bottom is the fiery tombs of the heretics, on the sides are the pigs of the gluttons, and at the top are the mournful souls who lived too early, the virtuous pagans.” In laboratory investigations on the directive influence of drive, a major difficulty is that the magnitude of the drive is relatively small in comparison with other effects. Requiring the subject to abstain from
eating itself introduces set-effects, and in- 
forming all subjects that the study is food- 
related offers only a partial solution, as the 
information supported by abstinence has a 
different effect from the same information not 
so supported. The only solution to this prob- 
lem is either to investigate no more than 4 or 
5 hours of deprivation, or to obtain subjects 
who have not eaten for reasons unconnected 
with the study. Fortunately, there were several subjects who were assigned to the control group but were eliminated because they had missed one or more meals. Follow-up interviews were held with seven such subjects who had not eaten for 8-23 hours and who rated themselves as “hungry” to “very hungry.” Not one of these produced a strong food association and five produced weak food associations. The results [...] consistent with those on [...]


SUMMARY


[...] and female college students [...] levels of deprivation: [...] for 0-1 hours, 40 for [...] hours, and 21 for [...]

The major findings were as follows:

1. Subjective hunger ratings as a function of time without food increased through 8 hours, but did not increase further at 23 hours. This was interpreted as suggesting that hunger itself may level off or decrease somewhere between 8 and 23 hours of deprivation, and that drive-inhibition probably occurs.

2. Overall food imagery increased through 8 hours of deprivation and decreased at 23 hours. However, the relationship reached statistical significance only when strong food associations were eliminated. This was interpreted as supporting drive-inhibition and indicating that strong food associations, since they are more easily inhibited, are more susceptible than weak food associations to influence by factors other than drive.

3. A group of high ego strength subjects reported significantly less hunger at 8 and 23 hours of deprivation, and produced significantly fewer food-related responses at 23 hours of deprivation, than a group of low ego strength subjects. Only the high ego strength group demonstrated a decrease in food responses at 23 hours. It was concluded that ego strength is related to the inhibition of both drive and drive-related responses.

4. Food-related activity-responses were significantly and positively related to deprivation; food-object responses were not.

5. Significance was not found for goal and instrumental food responses. It was proposed that the goal-instrumental distinction could be subsumed under a dimension of drive-relevance of the response.

6. There was no evidence that food responses become more accurate or stimulus-determined as deprivation increases.


THE RELATIONSHIP BETWEEN FUTURE TIME PERSPECTIVE,
TIME ESTIMATION, AND IMPULSE CONTROL IN A
GROUP OF YOUNG OFFENDERS AND IN
A CONTROL GROUP


ARON WOLFE SIEGMAN 1 
University of Maryland School of Medicine 


Recent years have witnessed an increasing interest in time as a psychological variable. There has been concern with sources of variance in time estimation and with individual differences in time orientation (Wallace & Rabin, 1960). The present study attempts to relate both of these dimensions. More specifically, this study investigated the hypothesis that the range of one’s future time perspective is a significant source of variance in the experience of duration: the longer the perspective, the [...]. This hypothesis is [...] assumptions: that the [...] that an interval of [...] the more the subject desires that an interval of time pass rapidly, the longer it will appear to be. Generalizing from this finding one can predict a positive correlation between future time perspective (defined as the relative distance of one’s life goals) and time estimation. A study by Knapp and Garbutt (1958) on the relation between achievement motivation and time imagery also suggests the hypothesized relationship between future time perspective and the experience of duration. In this study it was found that subjects with high achievement motivation described time in metaphors which reflected a very rapid internal clock. Since there is evidence which indicates that need achievement is one of the sources of future time perspective (Siegman, in press-b), the findings of Knapp and Garbutt (1958) suggest a positive correlation between the range of subjects’ future time perspective and the speed of subjects’ internal clock.

The hypothesized positive correlation between future time perspective and time estimation scores generates the additional prediction that delinquents will have shorter time estimation scores than [...]. This [...] (Barndt & Johnson, 1955) that [...] shorter future [time perspectives than] comparable nondelinquents.

1 [This study was conducted] while the author was at Bar-Ilan University, Israel. The author is grateful [...] collection of the data for this study.

A second objective of this study was [to investigate] the relationship between [...] future time perspective. LeShan (1952), in one of the first empirical investigations of future time perspective, suggested the hypothesis that the more thoroughly a child learns to delay the immediate gratification of his impulses for the sake of some [...] and more distant goal, the greater his future
time perspective as an adult. Assuming that 
there is less learning of such control among 
lower-class than middle-class children, LeShan 
(1952) predicted and found significantly 
shorter future time perspectives among lower- 
class children. Furthermore, assuming that 
delinquents have failed to acquire such delay 
capacity, LeShan (1952) argues that they 
should also have relatively more restricted 
future time perspectives. This prediction was 
confirmed in a subsequent investigation 
(Barndt & Johnson, 1955). These findings, 
however, cannot be considered as sufficient evidence for the hypothesis that future time perspective is a function of impulse control training. It is clear that delinquents and nondelinquents, and lower- and middle-class children, differ from each other in relation to many other variables than impulse control training, some of which may be responsible for the differences obtained in relation to future time perspective. Consequently, the present study investigated the hypothesized relationship between future time perspective and impulse control training in a more direct fashion. Actually, this study investigated the relation between subjects’ future time perspective and subjects’ present impulse control level.


SUBJECTS AND PROCEDURE 


The delinquent group consisted of 30 residents of the Tel-Mond Prison for Young Offenders, Israel. They were selected according to alphabetical order from the age range 17-19. The education of this group ranged from 1 to 8 years, with Mean = [illegible] and SD = 1.28. The nondelinquent group consisted of 22 subjects who, in order to control for institutionalization, were selected from among recent inductees in the Israeli Army. Subjects for the nondelinquent group were selected so as to obtain an age and educational distribution identical to the experimental group. The two groups were also equated for ethnic origin, [...] 23% of Middle-Eastern or North African origin, [...] European origin. Subjects of both groups were of lower socioeconomic background.

Subjects’ future time perspectives were determined by a method similar to that used by Wallace (1956). Each subject was asked to enumerate 10 events, or things which he may do or which may happen to him in the future. At the completion of this task, the subject was asked to indicate what age he would be at the occurrence of each of the events. The median of the differences between the subject’s present age and the ages indicated by the subject for the various events was used as the subject’s future time perspective score.

TABLE 1

FUTURE TIME PERSPECTIVE MEAN AND STANDARD DEVIATION SCORES IN DELINQUENT AND NONDELINQUENT GROUPS

Group            N     M      SD
Delinquent       30    3.10   .412
Nondelinquent    22    4.95   [illegible]
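The scoring rule just described reduces to a median of age differences, which the following sketch makes concrete. The function name `future_time_perspective` and the example ages are hypothetical and serve only as illustration.

```python
from statistics import median


def future_time_perspective(present_age, event_ages):
    """Median distance, in years, between present age and expected event ages."""
    if not event_ages:
        raise ValueError("at least one event age is required")
    return median(age - present_age for age in event_ages)


# Hypothetical subject aged 18 who named 10 future events.
score = future_time_perspective(18, [18, 19, 19, 20, 21, 22, 25, 30, 40, 18])
```

Using the median rather than the mean keeps one very distant event (here, age 40) from dominating the score.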

In the time estimation task, each subject was presented with the following time intervals: 5, 25, and 15 seconds. The beginning and end of each interval were marked by the click of a stop watch. The intervals between stimuli were 5 seconds. In order to control for the serial position effect found by previous investigators (Eson & Kafka, 1952; Falk & Bindra, 1954; Siegman, in press-a), one half of each group was presented with the stimuli in the order in which they are listed, and the other half in the reversed order. All subjects were told that the purpose of this study was to determine how they experienced various periods of time, and were instructed not to count off the seconds and not to make use of mnemonic devices. Two subjects of the nondelinquent group did not participate in the time estimation task.
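The counterbalancing of presentation order can be sketched as below. How subjects were actually allotted to the two orders is not stated, so the rule used here (first half of a list gets the listed order, second half the reversed order) and the identifiers are assumptions.

```python
INTERVALS_SECONDS = (5, 25, 15)  # order in which the intervals are listed


def presentation_orders(subject_ids):
    """Map each subject to an interval order, half listed and half reversed."""
    half = len(subject_ids) // 2
    orders = {}
    for position, subject in enumerate(subject_ids):
        if position < half:
            orders[subject] = INTERVALS_SECONDS
        else:
            orders[subject] = INTERVALS_SECONDS[::-1]
    return orders


# Hypothetical subject labels for one group.
orders = presentation_orders(["s1", "s2", "s3", "s4"])
```

Applying the function separately to each group keeps the two orders balanced within the delinquent and nondelinquent samples.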

Subjects’ impulse control level was determined by 
means of a motor inhibition task. In this task, sub- 
jects were instructed to trace a 23-inch circle on 
onion skin paper as slowly as possible. This task is a 
variation on a task which was used by Singer, Wilen- 
sky, and McCraven (1956), who asked subjects to 
write certain words as slowly as possible. A factor 
analytic study, and a number of other studies in 
which this task was used as a measure of subjects’ 
impulse control level, provide considerable construct 
validity for this kind of task (Singer et al., 1956). 

Progressive Matrices (PM) scores were available for all subjects of the control group.


RESULTS 


Table 1 indicates that the delinquent group obtained, as was hypothesized, lower future time perspective scores than the nondelinquent group. Since subjects’ future time perspective scores were not normally distributed, the significance of the difference between the two groups was evaluated by the Mann-Whitney U test (Siegel, 1956, pp. 116-127). The results were: U = 130.5, p < .0003.
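For readers unfamiliar with the statistic, a minimal sketch of how U can be computed is given here. It counts, over all pairs of one score from each group, how often the first group's score is higher, with ties counted as one half, and reports the smaller of the two possible U values. The function name and the example scores are hypothetical; significance would still have to be read from tables or a normal approximation, which this sketch does not attempt.

```python
def mann_whitney_u(x, y):
    """Return the smaller Mann-Whitney U for two independent samples."""
    if not x or not y:
        raise ValueError("both samples must be non-empty")
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5  # a tied pair contributes half a count
    u_y = len(x) * len(y) - u_x  # the two U values sum to n1 * n2
    return min(u_x, u_y)


# Hypothetical future time perspective scores for two small groups.
u = mann_whitney_u([1.0, 2.5, 3.0, 3.0], [2.0, 3.0, 4.5, 6.0, 7.0])
```

With groups of 30 and 22 subjects the two U values sum to 660, so a reported U of 130.5 lies far from the value of 330 expected when the groups do not differ.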

Table 2 lists the time estimations of the delinquent and nondelinquent groups. As predicted, the delinquent group obtained the lower time estimation scores. Because of heterogeneity of variance and because the time [...]

[...] [Since the] motor impulse control scores [were not normally] distributed, the significance of the difference between the two groups was determined by the Mann-Whitney U test. The results were: U = 305, which is clearly not significant (p = .65). This finding suggests that delinquents are not, as is generally assumed, less able to control their impulses. The common observation that delinquents have a history of impulsive behavior may be due to the fact that [...] to control their impulses rather than to defective control mechanisms. The fact that [...] a socioeconomic background which does not provide them with [...] controlling their anti[social ...] is perhaps one [...] lack of motivation.

[...] evidence is equivocal [...] correlation failed, however, to reach the conventional .05 significance [level ...] study the correlations [...] in the nondelinquent group, and [...] -.22 (p = .08), and -.21 (p = .10) in the delinquent group. Again the correlations fail to reach the .05 level of significance. The fact, however, that the correlations vary consistently between the .05 and .10 level of significance is suggestive of the hypothesized relationship between impulsivity and the experience of duration.


SUMMARY 


A positive correlation was found in a group of young delinquents and in a comparable group of nondelinquents, between the range of subjects’ future time perspective and the speed of subjects’ internal clock as measured by a time estimation task.

The delinquent group obtained significantly lower future time perspective scores as well as lower time estimation scores.

A significant positive correlation was obtained, in the delinquent group, between subjects’ future time perspective scores and their scores on a motor impulse control task. The correlation between these two variables in the nondelinquent group, however, was not significant.

There was no significant difference between the delinquent and nondelinquent group in relation to their scores on the motor impulse inhibition task.

Negative correlations were obtained between subjects’ time estimation and motor impulse inhibition scores, which were significant between the .05 and .10 level.

Finally, the results of the present study also suggest that there is no significant correlation between general intelligence and future time perspective or impulse control.


GROUP PSYCHOTHERAPY, A SPECIAL ACTIVITY PROGRAM,
AND GROUP STRUCTURE IN THE TREATMENT
OF CHRONIC SCHIZOPHRENICS 1


JAMES M. ANKER AND RICHARD P. WALSH 2


Veterans Administration Hospital, Perry Point, Maryland


Activity programs and group psychotherapy frequently have been used in the treatment of the chronic patient. Evaluations of these procedures, unfortunately, often have suffered from poor definition and choice of procedure, and inadequate design. This paper reports an experimental study of these therapies in combination with the effect of [...]

The use of activit[y ...] method in the [...] hospital patients [...] authors since have [...] success with this [...] successive variations [...] management.

Scher (1957a, 1957b) extended this concept considerably with his emphasis upon the “task orientation” of the patient. When response to the task was demanded rather than being requested Scher concluded that significant therapeutic changes were produced. A controlled study by DiGiovanni (1958) failed to replicate Scher’s results when the activity group was compared with a psychotherapy group and a control group, and all groups were compared before and after. The treatment procedures were conducted for only [illegible] months, however, as compared with 12 months in Scher’s study.

1 A review of this paper was presented at the annual Research Conference, [...] therapy Studies in Psychiatry [...] Approaches to Mental Illness, [...] 1960, and at the eleventh [...] Veterans Administration-Universities Con[ference ...], Washington, D. C., December 1, 1960.

2 Formerly counseling psychologist, Psychology Service, Veterans Administration Hospital, Perry Point, Maryland. Now Chief Counselor and Assistant Professor in Psychology, Testing and Counseling Center, University of Cincinnati.
Members of naturally occurring groups [generally] adopt responsibility and/or develop ways of delegating it to themselves and other members. They participate in a general division of labor. Krech and Crutchfield (1948) suggest that this mutual cohesiveness, as opposed to external pressure, is what constitutes a group. It was presumed that this type of social organization could occur in a chronic schizophrenic group and, to the extent that it did occur, behavior alteration would be [...]. This kind of group may be distinguished from many extant hospital activity [programs by] the locus of responsibility, the patient members themselves.

The activity program evaluated in this study was designed to promote this [...] type of social organization. The criteria [...] for it were as follows: (a) the group should have a definite goal or finished product which can be achieved in a relatively short period of time; (b) the goal should be periodic [such] that once the immediate goal is achieved a [similar] one, but one presenting new challenges, takes its place; (c) there [must be] a sufficient range of demand so that [...] (d) [...] group; and (e) this activity must be of such a nature that patients are capable of maintaining it with a minimum of staff intervention, particularly professional staff. After evaluating a number of possible programs, the activity chosen for study was the production of plays for hospital patients and personnel. The characteristics of this activity are described in more detail under “Method.”

It would be insufficient to describe the type of group psychotherapy studied as “orthodox.” A review of pertinent literature in the area reveals a rather widespread range of approaches to group psychotherapy with schizophrenics (e.g., Bach, 1954; Grauer, 1955; Klapman, 1946, 1947; Kramer, 1957; Lazell, [year illegible]; Peyman, 1956; Powdermaker & Frank, 1953; Schnadt, 1955). The technique used in this study very closely resembles that described by Kramer with the possible exception of less emphasis being placed upon the role of interpretation by the therapist. The atmosphere was permissive and designed to [...] a growing sense of “belongingness” by fostering a comfortable “family quality.” During the sessions primary emphasis was placed on nonpsychotic interactions between patients. Interaction was encouraged and [solicited] by the therapist but not demanded. The therapist patterned his orientation after that described by Frank (1952) by being “a [...], strong, accepting person [who ...] the situation clearly for the patients and supported them in their emotional [...].” Any level of nonpsychotic verbal interaction was encouraged. This included [...] like the difficulty in keeping personal belongings identified in the hospital laundry and the problem of saving enough money [for] passes. This technique might be contrasted [with the] didactic or pedagogical approach suggested by Lazell and Klapman.

A number of authors ([...]; [...], 1957; Powdermaker & Frank, [...]) writing on group psychotherapy [have] commented on the [...] of group [homogeneity] or heterogeneity. While there is [active] disagreement in this area, the consensus favors some type of heterogeneity. Group structure, because of its implications for group treatment methods generally, was [...] as a main effect in this study.




It was hypothesized that significant im- 
provement in behavioral adjustment would 
occur as a result of group psychotherapy, the 
special activity program (drama group), and 
heterogeneous group structure. The analysis of these independent variables and of their interactions constitutes the study reported here.


DESIGN OF STUDY


The three independent variables and their interactions were analyzed simultaneously by a 2 × 2 × 2 factorial design, each variable being dichotomized. The effectiveness of group therapy was evaluated by contrast with a comparable group not receiving group therapy, the effectiveness of the drama group by contrast with a comparable group not in the special activity program, and the effectiveness of heterogeneity by contrast with a comparable homogeneous group. Because a patient’s original level of behavioral adjustment could influence the degree of change in adjustment, the data were adjusted for this effect by covariance. Additionally, it was impossible to insure that all patients would remain in the study until its conclusion. Because the length of exposure to the treatment procedures could affect the degree of change in adjustive behavior, this effect was covaried as well. Thus the design was a 2³ factorial analysis of multiple covariance. The unique combinations of the three dichotomous variables resulted in eight distinct “treatment” groups.
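The eight cells of the design can be enumerated mechanically, as in the sketch below. The factor labels are paraphrased from the text and the variable names are hypothetical; nothing here reproduces the covariance analysis itself.

```python
from itertools import product

# Three dichotomized independent variables (labels paraphrased from the text).
FACTORS = {
    "group_psychotherapy": ("yes", "no"),
    "drama_group": ("yes", "no"),
    "group_structure": ("heterogeneous", "homogeneous"),
}


def treatment_cells(factors):
    """List every combination of factor levels as a dict, one per cell."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


cells = treatment_cells(FACTORS)
GROUP_SIZE = 7  # group size stated in the procedure
total_subjects = len(cells) * GROUP_SIZE
```

Eight cells of seven patients each give the 56 experimental subjects mentioned in the procedure.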


PROCEDURE 


Selection of Subjects and Groups 


One hundred thirty-four male schizophrenic patients
on a continued treatment ward of a 1,500-bed
Veterans Administration Neuropsychiatric Hospital
were rated on the Multidimensional Scale for Rating
Psychiatric Patients (Lorr, 1953). A pilot study of
interrater reliabilities produced an average reliability
coefficient of .85 taken over 11 ward personnel. The
average interrater reliability coefficient for three raters
on the interview section was .91. Coefficients in the
total matrix ranged from .66 to .96. This level of
reliability was considered sufficient to allow ratings
by different raters to be considered as comparable.
The protocols were scored for each patient and each
profile was compared with the hypothetical normal
profile by means of the D statistic (Osgood & Suci,
1952). A distribution of Ds thus was generated, one
end of the distribution reflecting maximum congruence
with normal behavior, the other end maximum
divergence, or pathological behavior. This
distribution was normalized. Subjects for the four
homogeneous groups were chosen randomly from patients
having T scores within ±1 standard deviation.
The four heterogeneous groups each were comprised
of two patients with T scores of less than -1
SD, two patients with T scores of greater than +1
SD, and three patients with T scores in the midrange.
Based on evidence presented by Bales and
Borgatta (1955) and experience in group psychotherapy,
group size was limited to seven. Each of the
four homogeneous and four heterogeneous groups was
assigned randomly to the treatment combinations of
group psychotherapy and the drama group. All eight
groups had the same average level of pathology as
measured by the Lorr scale. There were no significant
differences between groups regarding age, duration
of hospitalization, or the taking of ataractics.
Median age was 38.9 years. Median duration of hospitalization
was 9.2 years. Fifty-three of the 56 experimental
subjects were on ataractics.
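
The D statistic used to compare each patient's profile with the normal profile is the generalized distance between two profiles, that is, the square root of the summed squared differences across scales. A minimal sketch follows; the scale scores and the all-zero "normal" profile are hypothetical values chosen for illustration, since the paper does not list the actual profiles.

```python
import math

def d_statistic(profile, reference):
    """Osgood-Suci D: square root of the summed squared scale differences."""
    if len(profile) != len(reference):
        raise ValueError("profiles must have the same number of scales")
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(profile, reference)))

# Hypothetical scale scores; an all-zero "normal" profile is assumed here.
patient = [3, 0, 4]
normal = [0, 0, 0]
print(d_statistic(patient, normal))
```

A D of zero means the patient's profile coincides with the reference profile; larger values mean greater divergence.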


Measures 


The principal dependent variable, behavioral adjustment,
was measured by the MACC Behavioral
Adjustment Scale (Ellsworth, undated). The MACC
produces scores entitled Motility, Affect, Cooperation,
and Communication, and a Total Adjustment
score, the sum of the Affect, Cooperation, and Communication
scores. This scale has been shown to differentiate
significantly between open and closed ward
continued treatment patients, to be correlated significantly
with the Hospital Adjustment scale, and
to be associated significantly with other measures of
improved behavioral adjustment such as the length
of time spent on pass. The scale is short, 14 items,
and can be rated with high reliability. Pilot study
data on ratings by pairs of raters used in the experiment
produced interrater reliabilities ranging from
.82 to .99 with an average coefficient of .92, taken
over the 15 combinations of six rater pairs. These
levels are consistent with reliabilities previously reported
for the scale.

Ancillary measures of group cohesiveness and social
choice were taken in the hope that the experimental
groups would produce measurable changes in
peripheral social behavior. The Semantic Differential
profile given by each patient on himself was compared
with the average profile he gave for other
members of his group. A D statistic was calculated
and interpreted as a measure of cohesion, a small D
indicating cohesion.

Social choice data were obtained in a free choice
situation. All patients on the experimental ward ate
at the same time in the same area of the dining hall.
They were seated four to a table but had complete
freedom to choose any table in the area and any
companions from among their fellow patients. Actual
choice of companions at the noon meal was recorded
for the 56 patients in the study. These choices were
then categorized as “in group” or “out group” choices.


Method


Following the pilot study on reliability and the
selection of groups, all subjects were rated on the
MACC, were given the Semantic Differential (which
included their name and the names of the other
members of their group), and were observed at their
noon meal for 3 consecutive days and their choice
of companions recorded. Sleeping arrangements were
changed so that group members had adjacent beds.
Simultaneously the experimental procedures began.
The four groups in group psychotherapy were seen
twice each week for 1½ hours, a total of 3 hours a
week. All groups were seen by the same therapist,
the senior author.

The four drama groups were formed into two
groups of 14, one homogeneous and one heterogeneous.
In each group half of the patients were in group
psychotherapy and half were not. These groups met
three times a week for an hour. Generally they met
in the Recreation Hall with a staff moderator, a
recreational therapist from Special Services. This
moderator had been instructed to supply the groups
with all the material and information requested or
needed by them for the production of plays, but to
avoid taking over the “leadership” of the group. The
role of the moderator might best be characterized as
a “nondirective resource person.” This role proved
to be a difficult one to assume and was maintained
only by frequent consultations between the experimenters
and the moderator. Difficulties appeared to
stem from the moderator’s identification with the
group himself. Thus, when a group once decided to
put on a play reading from the scripts, the moderator
became personally concerned over the adequacy of
their decision. The moderator was present for all of
the earlier meetings of the groups but missed some
as time went on and occasional conflicting duties
prevented his attendance. On those occasions the
groups met without him. At the beginning of the
study all patients in the drama groups were told individually
that they had been chosen for a detail to
provide plays for the entertainment of the staff and
fellow patients. They were “assigned,” not given a
choice. Some patients protested leaving present details
or simply engaging in an activity for which they
did not care. Most patients, however, accepted the

taken every

leaving the hospital within 2 months after the beginning
of the study was to be replaced by another
randomly selected subject. Patients who left after
longer than 2 months were counted as subjects but
were replaced in groups by another “equivalent” patient.
No data were collected on these “replacements,”
who were used only to maintain the groups at
their full strength.


RESULTS 


Because the primary dependent variable,
the MACC, consisted of four subscores and a
summary score, five separate analyses were
done. In each case the analysis of the final
scores was adjusted by multiple covariance
for the effect of initial level and the length of
time the subject was in the study. None of the
main effects or interactions reached significance
at the .05 level for the Affect subscores.
The activity effect, however, with an F of
3.969, narrowly missed significance at this
level. The activity effect was significant for
all other subscores and for the total adjustment
scores: motility, p < .05; communication,
p < .01; cooperation, p < .01; and total
adjustment, p < .01. Group psychotherapy
reached significance at the .05 level for the
motility subscores, but was nonsignificant for
the other subscores and the total adjustment
scores. The group structure effect did not
reach significance on any of the measures. All
interactions were nonsignificant.
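
As a rough check on how narrowly the Affect result missed significance, the error degrees of freedom can be worked out from the design. The sketch below assumes that all 56 subjects entered each MACC analysis, which the paper does not state explicitly. The margin of .08 is the one quoted in the Discussion, and the implied critical value agrees with the tabled .05 value of F for 1 and 46 degrees of freedom (about 4.05).

```python
N_SUBJECTS = 56      # experimental subjects reported in the text (assumed all analyzed)
N_CELLS = 2 ** 3     # eight treatment combinations
N_COVARIATES = 2     # initial level and length of time in the study

# Error df for a factorial analysis of covariance: N - cells - covariates.
error_df = N_SUBJECTS - N_CELLS - N_COVARIATES

F_ACTIVITY_AFFECT = 3.969   # reported F for the activity effect on Affect
REPORTED_SHORTFALL = 0.08   # margin quoted in the Discussion
implied_critical_f = F_ACTIVITY_AFFECT + REPORTED_SHORTFALL

print(error_df, round(implied_critical_f, 3))
```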

Analysis of the Semantic Differential distance
measures between self-rating and the
average rating of other group members revealed
no difference between original and final
measures that could be attributed to any
treatment or treatment combination. This
measure produced very high attrition because
of blank, incomplete, or obviously invalid
protocols. It is interesting to note, however,
that all distance measures used decreased
over time and that this difference was significant
at the .02 level.

Changes in choice of luncheon companions
from outgroup to ingroup were practically nil.
Social choices showed little change
over time and no significant differences
were obtained, either between treatments
and combinations of treatments or
between original and final choices over all
groups.
DISCUSSION AND CONCLUSIONS


One result of this study is the consistency
with which the activity group
showed significant change on the various 
categories of the MACC Behavioral Adjust- 
ment Scale. Changes were significant on all 
but the Affect subscale where the F missed 
significance at the .05 level by a value of 
only .08. These changes uniformly were in 
the direction of improved behavioral adjust- 
ment. The significant change in the Motility 
subscale reflected a lessening of motility. The 
data suggest this was a decrease in behavioral 
agitation and restlessness. Group psychotherapy
showed a significant decrease in motility
as well, but the data did not reach significance 
on the other subscales or on the Total Ad- 
justment score. No significant results were at- 
tributable to the homogeneity-heterogeneity 
variable. During the study 18 patients left on 
trial visit or discharge, 2 of whom returned
within 6 months. No group or treatment 
showed a significant difference in this regard. 
The fact that the significant differences 
found on the MACC for the activity group 
were not found in the Semantic Differential 
and social choice data for the same group re- 
inforces earlier questions about these meas- 
ures. A satisfactory method for screening in- 
valid Semantic Differential protocols was not 
found. While some protocols were obviously 
invalid, e.g., those showing an invariant response
pattern on the test form, in many cases
this judgment was difficult to make. When the 
validity of a protocol remained in question it 
was accepted as data and treated as valid. 
This is an arbitrary procedure at best. Reli- 
abilities on this instrument, using only the 
“valid” protocols, calculated from immediate 
test-retest by replicated items in the test form, 
ranged from —.25 to .96 with an average of 
.58. The overall change from pre- to posttest, 
if interpretable at all, most likely reflects a 
change in therapeutic procedures on the ward 
which occurred simultaneously but independ- 
ently of the study. Overall rates of leaves, 
trial visits, and discharges also increased. 
Although choice of luncheon companions
was intended to be a measure of the formation
of “real” groups resulting from the artificial
experimental groupings, it became obvious
that this behavior was stereotyped
and insensitive to change. Use of a behavior
which did not have a previously
stereotyped pattern would have been advantageous.

The results of the study are encouraging.
While only one treatment produced significant
changes, it did so with compelling regularity
and consistency. The activity variable was
responsible for most of the change in behav- 
ioral adjustment that occurred. It should be 
pointed out, however, that an important dif- 
ference existed between the therapy and the 
drama groups in addition to the differences 
in “treatment.” The group psychotherapist 
was a different person from the resource per- 
son associated with the drama groups. Thus, 
it is possible that the results document differ- 
ences between the skills of these two people 
rather than between treatments as such. While 
this interpretation is possible it does not seem 
most parsimonious to the authors. This problem
was evaluated when the study was designed
and there appeared no feasible way of
separating person effect from treatment effect
and maintaining an adequate design. Additionally,
the results favor the activity effect,
a treatment wherein the resource person had
only minimal contact. The amount and nature
of contact with the subjects was specified as
carefully as possible before the study began
and every effort was made to insure they were
maintained as such. Thus this problem does
not affect the interpretation of overall significance in
any event. One could question the significance
of the results in other areas for the
group psychotherapy effect, however. It is
possible that a more skilled therapist, using
the same procedures, might have produced
more significant results. It was decided to
spell out the group psychotherapy procedures
as clearly as possible and have all groups
seen by the same therapist to avoid confounding
intertherapist differences. Because the
drama activity had its own discrete characteristics,
in addition to the criteria specified
in the study, it is impossible to state with
certainty the source of the significant differences.
It is clear, however, that the activity
studied produced significant results in the
predicted direction and there is reason to expect
that it would do so again at another time
or in another place.

This latter finding taken alone should be
of significance to those in mental hospitals
concerned with the treatment of chronic
schizophrenics. The activity program studied
here produced consistent and significant behavioral
change with a minimum of staff intervention
and expenditure of time. The “personnel
efficiency” of such a treatment method
is unquestionably of value. Of much greater
significance, however, is the fact that this inexpensive
technique produces results which, in
this study, were incomparably better than the
more “expensive” and time consuming group
psychotherapy requiring a highly trained
therapist. The implications of this study for
the systematic use of nonprofessional personnel
in the active treatment of chronic schizophrenics
are compelling, as well as being attractive.

A number of refinements present themselves
for future study. There is a
need to vary activity programs by content while
holding basic selection criteria constant, to
evaluate the generality of the results. It
would be expected, of course, that any activity
program constructed to conform to the
basic criteria and administered as the one
currently studied would produce equivalent
results. The difficult problem of therapist
“effects” in group psychotherapy requires further
attention. Although the homogeneity-heterogeneity
variable did not produce significant
results as it was studied, it is likely
that the method of study could be improved.
In this study it was advantageous to make
the central tendency in the two types of group
structure equivalent, varying only the dispersion.
It is likely, however, that an approach
to group structure may interact with level of
pathology. Thus a study varying both of these
systematically would provide informative data.
of initial level of behavioral adjustment and
length of stay in the therapeutic program.
The primary dependent variable was the
MACC Behavioral Adjustment Scale and its
subscales. Measures of group cohesion and
social choice also were obtained. The activity
variable produced significant and consistent
results in the predicted direction. Group psychotherapy
produced relatively minor positive
results and the group structure variable produced
none. None of the interactions was significant.
The ancillary measures of group cohesion
and social choice showed no systematic
change. The implications of this study
for the use of this kind of activity program
involving nonprofessional personnel in the
treatment of chronic schizophrenic patients
are positive and compelling. Refinements in
design and suggestions for future research
were presented.
THE USE OF AN EXTENDED DRAW-A-PERSON
TEST TO IDENTIFY HOMOSEXUAL AND
EFFEMINATE MEN 1


LEIGHTON WHITAKER, JR. 2
Wayne County General Hospital, Michigan 


Classical psychoanalytic theory holds that 
individuals may have a psychological identi- 
fication with the same sex or with the opposite 
sex (Fenichel, 1945). The concept of “psychosexual
identity” has been used in projective
testing of personality also. Regarding the
Draw-A-Person Test (the DAP), Machover 
(1949, p. 101) has stated: 


From the standpoint of sexual identification, it is assumed
to be most normal to draw the self-sex first.
From an empirical point of view, it is of interest 
that evidence of some degree of sexual inversion was 
contained in the records of all individuals who drew 


the opposite sex first in response to the instruction, 
“draw a person.” 


Presumably what is cruci 
is that the individual is 
sex of the person dra 
his own psychosexual identity. 


sen (1955) haye 
efficiencies of such 
be evaluated rela- 
e characteristics, 


1 Paper presented at the Michigan Academy of
Arts, Sciences, and Letters, Psychology Division, Personality
and Clinical Section, at Ann Arbor, Michigan,
March 1960.

2 Data for this research were colle
author was at Recorder’s Court Psych 
Detroit, with the kind permission 
Executive Director, A 


» for their assist- 
and end phases 
respectively. 
METHOD


Two hundred and thirty-six men, with an average
age of 28, who were referred to two
clinical psychologists in a court clinic, served as subjects.
Each subject was first given an examination
which, at the minimum, consisted of a life-history
interview and the Verbal section of the Wechsler
Adult Intelligence Scale. Judgments were then made
by the same examining psychologist, on the basis
of the examination, as to whether the subject was
homosexual or effeminate. Finally, the subject was
given the extended DAP by the author. No attempt
was made to select subjects and, with relatively few
exceptions, all men referred to the two psychologists
during a period of 7 months were used in the research.


Selection of the criteria according to which a subject
was judged homosexual or effeminate presented
three particular problems. First, the meaning of these
terms varies considerably according to the individuals
using them. It was necessary to give these terms
highly explicit definitions so as to allow high criterion
reliability. Second, the information on which
judgments were based varies somewhat in kind
and extent available. No special means was found to
overcome this problem. Third, since not all homosexual
men are effeminate according to psychoanalytic
theory, separate judgments of homosexuality
and effeminacy had to be made. However, it was
felt that most homosexual men do have
feminine psychosexual identity and it was expected
that as a group, therefore, they would project this
personality feature into their free choice drawings of
the DAP.

The label “homosexual” meant admission by the
subject to the examining psychologist of one or more
instances of manifestly sexual impulses and/or behavior
toward a person of the same sex when both
parties were past pubertal age. The label “effeminate”
meant one or more of the following: (a) presence of
effeminate speech, gestures, or mannerisms during
the examination; (b) personal description by the
subject to the psychologist of playing a manifestly
feminine role where, for 2 or more years, most of
the subject's activity consisted of housework, sewing,
washing, ironing, infant care, etc.; (c) personal
description by the subject to the psychologist of very
strong interests in typically feminine activities


TABLE 1

DISCRIMINATION OF HOMOSEXUAL AND/OR EFFEMINATE MEN BY DRAW-A-PERSON SIGN

Homosexual Effeminate H and/or E
Regular DAP
Test Test Test
- + - + - +


45 +[16 9] 25 +] 33 26] 59 
d Interview 191 — 1/175 36| 21 — |158 19] 177 
- 
191 45 236 191 45 ó 191 45 236 
“ye = 19, p < 001 x$ = 5, p < 001 x = 32, p < 001 
i Extended DAP-A 
Test Test Test 
- + = $ - + 
? +o 36] 45 a | a m s +j 47] 50 
f 
Interview — [14 47 191 — |149 62| 211 — |141 36| 177 
153 83 236 153 83 236 153 83 236 
| — 
y x? = 49, p < .001 x? = 27, p < 001 x? = 67, p < 001 
E Extended DAP-B 
g Test Test Test 
- + ee - + 
45 25 +] 17 42] s 
Interview 191 211 — |156 21| 177 
i 173 63 236 173 63 236 173 63 236 
— ž Py N Èm 
2a 52, p < 001 x? = 30, p < .001 x? = 80, p < .001 
t -S x a 
A Extended DAP-C 
Test Test 
- + - + 
+] 18 8 25 + | 38 21 59 
—|193 17] 211 = (473 al) aa 
21 25 236 211 25 26 


xX = 13, p< 001 


x? = 52, p < 001 
subject was rated homosexual or

was unequivocal.
The psychologists who examined and rated the
subjects had at least 3 years training and/or
experience beyond the master’s degree in clinical psychology.
To provide an estimate of the reliability of
ratings, each psychologist independently rated
40 cases for homosexuality and effeminacy.
There was complete agreement in rating homosexuality
and agreement in 38 of the 40 cases in rating
effeminacy.

The extended version of the DAP was administered
by the author as follows: Each man was told
“draw a person.” When the sex of the figure drawn
was decided by the man, he was told “Now draw a
person of the opposite sex.” Finally,
was obtained where,

test pro-
the sub-

second free choice is female.
tive if both free choices are female


RESULTS AND DISCUSSION

As shown in Table 1, all four methods of
scoring psychometric sign discriminated the
characteristics homosexual, effeminate, and
homosexual and/or effeminate beyond chance
levels. This aspect of the results supports the
theoretical expectation, based upon psychoanalytic
and projective test concepts of psychosexual
identity, that psychosexual identity
is projected into the choice of sex of the figures
drawn in free choice drawings on the
DAP.

As shown in Table 2, when the efficiencies
of the various signs are compared with the
efficiencies of classifying everyone simply as
not possessing the characteristic in question,
it is clear that the signs have no appreciable
value except for predicting the characteristic
homosexual and/or effeminate. Even here the
improvement is most modest, at best 84%
vs. 75%. However, differential weighting of
false positives vs. false negatives might alter
conclusions, dependent upon which error was
judged worse. For example, if it were of primary
importance to screen out all men with
the characteristics and of only secondary importance
to avoid screening out some men
without the characteristics, then the psychometric
signs may be of practical use. As shown
in Table 3, the extended DAP with Scoring
Method A screened 80%, 84%, and 80% of
the men with the characteristics homosexual,
effeminate, and homosexual and/or effeminate,
respectively. At the same time, 57%, 75%,
and 43% of the men screened out were
without these respective characteristics.


TABLE 2

PREDICTIVE EFFICIENCIES OF THE PSYCHOMETRIC SIGNS AND BASE RATES IN TERMS OF
CORRECT DECISIONS FOR INDIVIDUALS WITHIN THE GROUPS
(N = 236)

                     Homosexual    Effeminate    H and/or E
Correct decisions     N     %       N     %       N     %
Base rate            191    81     211    89     177    75
Regular DAP          184    78     184    78     [?]   [?]
Extended DAP-A       180    76     170    72     188    80
Extended DAP-B       194    82     184    78     198    84
Extended DAP-C       [?]   [?]     201    85     195    83

Base frequency for Homosexual = 45; Effeminate = 25; H and/or E = 59.


TABLE 3

MEN WITH THE CHARACTERISTICS WHO ARE POSITIVE IN PSYCHOMETRIC SIGN
(N = 236)

                     Homosexual    Effeminate    H and/or E
Test                  N     %       N     %       N     %
Regular DAP           20    44       8    32      27    46
Extended DAP-A        36    80      21    84      47    80
Extended DAP-B        33    73      18    72      42    71
Extended DAP-C        16    36       7    28      21    36


In view of the potential usefulness of the
extended DAP as a screening device, it is
suggested that the test be tried in other settings
where cross-validation data could be obtained.
Perhaps the greatest obstacle to obtaining
such data will be the difficulty, characteristic
of attempts to establish the validity
of projective techniques, in finding adequate
criterion measures. Some observations in the
present research suggest that more adequate
criterion measures would have shown the extended
DAP to be a more powerful discriminator
of psychosexual identity than the tables
of results indicate. For example, of the nine
homosexual men who were not positive on the
extended DAP with Scoring Method A, five
limited their homosexual activity to playing
the “masculine” role and would not, therefore,
be expected to have a feminine psychosexual
identity according to psychoanalytic theory
(Fenichel, 1945).


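
The comparison of a sign's efficiency with the base rate can be illustrated with a short calculation. The sketch below is an illustration only: the function name and dictionary keys are invented for the example, and the four cell counts are those recoverable from Tables 1 and 3 for the extended DAP with Scoring Method A against the homosexual criterion (36 of 45 criterion-positive men sign-positive, 83 sign-positive men in all).

```python
def sign_efficiency(tp, fn, fp, tn):
    """Summarize a 2 x 2 table of interview criterion by test sign."""
    n = tp + fn + fp + tn
    return {
        "hit_rate": tp / (tp + fn),               # men with the trait who are sign-positive
        "correct": (tp + tn) / n,                 # overall correct decisions
        "base_rate": max(tp + fn, fp + tn) / n,   # calling everyone the larger class
        "false_share": fp / (tp + fp),            # sign-positive men who lack the trait
    }

# Extended DAP-A, homosexual criterion, as read from Tables 1 and 3.
result = sign_efficiency(tp=36, fn=9, fp=47, tn=144)
for name, value in result.items():
    print(f"{name}: {value:.0%}")
```

With these counts the sign identifies 80% of the criterion group but makes fewer correct decisions overall (76%) than the base rate (81%), and 57% of the men it flags lack the characteristic, which is the trade-off between false positives and false negatives discussed in the text.

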
SUMMARY 


Two hundred and thirty-six men, referred
to a court clinic, were rated on the characteristics
homosexuality and effeminacy by a
clinical psychologist on the basis of life-history
interviews which he conducted. Each man
was then given an extended Draw-A-Person
Test on which he chose the sex of two of the
three figures drawn. All four possible methods
of scoring a test protocol as “positive” in


psychometric sign (one or more free choice 
drawings of a female) were used to predict 
the characteristics. The results support the 
theoretical expectation, based on psychoana- 
lytic and projective test concepts of psycho- 
sexual identity, that psychosexual identity is 
projected into free choice drawings. The psy- 
chometric signs were not more efficient, over- 
all, than the base rates in predicting the char- 
acteristics. However, differential weighting of 
false positives vs. false negatives might alter 
conclusions about the practical usefulness of 
the signs, dependent upon which error was 
judged worse. It was suggested that the ex- 
tended Draw-A-Person Test be used in other 
settings where cross-validation data could be 
obtained. 
THE PSYCHOLOGICAL SIGNIFICANCE OF THE MMPI
K SCALE IN A NORMAL POPULATION


ALFRED B. HEILBRUN, JR.


State University of Iowa


The long-standing assumption that the K 
scale is a measure of defensiveness stems di- 
rectly from the nature of its derivation—the 
detection of hospitalized psychiatric patients 
who presented normal profiles on the MMPI. 
However, two lines of evidence have sug- 
gested that K scale scores may also relate to 
general level of adjustment. For one, Wheeler, 
Little, and Lehner (1951) found that normal 
groups scored higher on the K scale than ab- 
normal groups, and their interpretation of 
these findings was consistent with general
clinical practice which has for a number of
years stressed K scale elevations as

of ego adequacy. The other line of evidence is
the rather consi

ment (Carp, 1950; Feldman, 1952; Gallagher,
1953; Hales & Simon, 1948; Schofield, 1953).

More recently, Smith (1959) has provided
evidence which suggests the K scale is not an
adequate measure of defensi

results of his

defensive for abnormal
population Ss to obtain high K scale
scores but a sign of health for normal population
Ss” (p. 276).

If true, the implications of a differential
psychological meaning for K scale scores for
abnormal and normal populations are serious.
The original addition of a K increment to
five of the nine MMPI clinical scales followed
a demonstrated enhancement of discrimination
between largely inpatient psychoneurotic
and psychotic groups and a general Minnesota
“normal” group. However, extension of this
weighting system to personality measurement
within a normal population assumes that (a)
K measures defensiveness, (b) defensiveness
is associated with a lowering of MMPI scale
scores, and (c) the correction by adding a K
increment raises these scale scores and provides
a more veridical assessment of the person.
However, if K is positively correlated
with psychological adjustment for normal subjects
and is not a measure of defensiveness,
the K correction would appear to be operating
in direct opposition to test validity. That
is, higher K scores would tend to be associated
with better adjustment for normal subjects;
yet the higher the K scale score, the
greater the K correction and the more the
elevation of the clinical scales in the psychopathological
direction. The problem of appropriate
K usage in normal population testing
was foreseen by the original workers (McKinley,
Hathaway, & Meehl, 1948) with the
MMPI who stated:


For other clinical purposes it is possible that other
λ-values [i.e., K weights] would be more appropriate.
Thus, it seems likely that for the best separation
of “maladjusted normals,” such as those who
abound in a college counseling bureau . . . other
weights might be better (p. 24).
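

The mechanics of the K correction make the concern concrete. The sketch below assumes the conventional K weights for the five corrected scales (Hs .5, Pd .4, Pt 1.0, Sc 1.0, Ma .2), which are standard MMPI practice but are not stated in this paper. The raw scores are hypothetical, and the rounding used in published correction tables is ignored.

```python
# Conventional K weights for the five K-corrected MMPI scales (not given in this paper).
K_WEIGHTS = {"Hs": 0.5, "Pd": 0.4, "Pt": 1.0, "Sc": 1.0, "Ma": 0.2}

def k_corrected(raw_scores, k_raw):
    """Add the weighted K increment to each scale that takes a correction."""
    return {
        scale: raw + K_WEIGHTS.get(scale, 0.0) * k_raw
        for scale, raw in raw_scores.items()
    }

# Hypothetical raw scores for one subject, scored at two K levels.
raw = {"Hs": 4, "D": 18, "Pt": 10}
low_k = k_corrected(raw, k_raw=10)
high_k = k_corrected(raw, k_raw=20)
print(low_k)
print(high_k)
```

With identical raw clinical scores, the subject with the higher K receives the higher corrected Hs and Pt scores, while the uncorrected D scale is unchanged. If K reflects adjustment rather than defensiveness in normal subjects, this is the direction of error the text describes.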


Results such as those provided by Smith's
study which raise questions about the appropriateness
of standard MMPI usage in normal
populations warrant careful scrutiny. It
is the purpose of the present study to evaluate
two hypotheses suggested by Smith. These
are:

1. The K scale is a measure of psychological
health (adjustment) in a normal population.

2. The K scale is not a measure of defensiveness
in a normal population.
METHOD


Definition of Differing Adjustment Groups 
within a Normal Population 


To test the hypothesis that K scale performance is
positively related to level of psychological health
within a normal population, two types of groups
from a normal college population clearly differing
in adjustment level were selected. A group of college
males (N = 146) who had sought help at the
University Counseling Service, State University of
Iowa, constituted a poorer adjusted normal (PA)
group. This number included 100 subjects who had
requested vocational and/or educational counseling
and 46 subjects who sought counseling for personal
adjustment problems. The male better adjusted (BA)
group was comprised of 153 students, none of whom
had been seen in the Counseling Service.

Parallel female PA and BA groups were also constituted.
The female PA group included 143 Counseling
Service clients, 100 having requested help for
vocational and/or educational problems and the remaining
43 for personal adjustment problems. The
female BA subjects (N = 197) were students who
had not requested help at the Counseling Service.
Both the male and female PA groups approximate
representative samples of Counseling Service clients
as far as the proportions of vocational-educational
and personal adjustment counseling requests are concerned.

Each of the 639 subjects included in the male and
female BA and PA groups had taken the MMPI under
one of two conditions: (a) as part of the university
prefreshman entrance battery in the summer
of 1958 or 1959, or (b) as part of the Counseling
Service intake battery. All of the BA subjects took
the MMPI as part of the prefreshman battery, and
about 90% of the subjects in the PA groups did.
Thus, the K scale scores for the four adjustment
groups are provided by very homogeneous samples
relative to both age and educational level at time
of testing. Since there is some evidence (Sarason,
1956) that K scores are related to intellectual ability
in college subjects, separate preliminary analyses of
this relationship for the males and females in the
present samples were conducted. The product-moment
correlation between K and the mean composite
percentile on the university entrance examination for
both males (N = 319) and for females (N = 373)
was .11, significant at the .05 level. Although this
figure suggests a relationship of limited magnitude,
the four adjustment groups were matched for ability
level. The group composite percentile means were:
male PA = 53.91, male BA = 53.72, female PA
= 54.47, female BA = 54.60.

Based on Smith’s hypothesis that the K scale measures
psychological health in a normal population, it
was predicted that the BA male and female groups
would score higher on the K scale than would the
PA groups, level of adjustment being defined in terms
of soliciting or not soliciting help for psychological
problems subsequent to testing.
Definition of Defensiveness 


For purposes of his investigation, Smith used the 
definition of defensiveness provided by Page and 
Markowitz (1955) as follows: “The defensive indi- 
vidual is described as one who fails to ascribe to 
himself characteristics of a generally valid but so- 
cially unacceptable nature” (p. 431). In the present 
study the concept of defensiveness was extended so 
as to include the self-ascription of characteristics 
which are not valid but are socially acceptable as 
well as the denial of valid but unacceptable charac- 
teristics. 

Since adjustive behaviors are much more likely to 
be socially acceptable than nonadjustive behaviors, 
one difficult aspect of deriving a useful measure of 
defensiveness in a normal population is the increased 
probability that a socially acceptable self-description 
is also factually correct. In the present study this 
problem of confounding defensiveness and accurate 
self-description was approached by using the self- 
descriptions of a group of subjects at the maladjusted
end of a normal population adjustment range.
Self-descriptions on the 300-item Adjective Check List 
(ACL) (Gough, 1960) of 50 male college students 
who sought help for personal adjustment problems 
at the Counseling Service were scored for the num- 
ber of endorsed adjectives which were included in 
the 75 judged to reflect most favorably on the en- 
dorser and the number included in the 75 judged to 
reflect least favorably on the endorser (see Gough, 
1955). Subtracting the latter count from the former 
provided a “favorability” count for each subject. 
By cutting the distribution of favorability scores at 
the median, a group of subjects giving more favor- 
able self-descriptions and a group giving less favor- 
able self-descriptions were obtained. Since all sub- 
jects were maladjusted it was assumed that the 
subjects giving more favorable self-descriptions rep- 
resented a more defensive group. A “Defensiveness” 
(Def) scale was then developed for the ACL by de- 
termining through chi square analysis which adjec- 
tives out of the 300 total reliably discriminated be- 
tween the high and low favorability subjects. The 
61 adjectives which discriminated were cross-vali- 
dated on a new sample of 34 male personal adjust- 
ment counseling subjects, and 28 of these adjectives 
significantly differentiated the newly constituted high 
and low favorability groups. These 28 adjectives¹
(composed of 27 favorable adjectives and one unfavorable
adjective which became a subtractive
item) were included in the male Def scale. It can
be noted that the mode of derivation for the male 
Def scale (as well as the female scale described be- 
low) parallels that of the K scale in the inclusion of 
items which discriminated between maladjusted sub- 
jects who portrayed themselves psychometrically in 

1 The lists of adjectives included in the male and
female Defensiveness scales may be obtained without
charge from Alfred B. Heilbrun, Jr., Department
of Psychology, State University of Iowa, Iowa
City, Iowa.
TABLE 1

MEANS AND SDS OF MALE AND FEMALE CORRECTED DEFENSIVENESS SCALES SCORED ON
ACLs OBTAINED UNDER STANDARD, DEFENSIVE, AND IDEAL-SELF CONDITIONS

                          Male                   Female
Testing condition     N    Mean    SD        N    Mean    SD
Standard             97   15.30   4.58     114   16.85   5.34
Defensive            30   18.63   3.92      11   22.18   4.23
Ideal-self           56   21.00     79      25   24.32   2.15


an unduly favorable light and those subjects who
did not. It was found that for a sample of 97 normal
male students the Def score was positively correlated
with total number of adjectives checked
(r = .71), so a correction was necessary. The mean
number of endorsed adjectives for these subjects
was 95.16 with an SD = 30.08, while the SD for
the Def scale was 3.28. A correction of one point
(about one-third SD) on the subject’s Def scale
score was made for each 10 endorsed adjectives (about
one-third SD) by which the total checked deviated from the


[...]al number was [...] if the total fell [...] were counted as a [...]
removed the con[...] adjective endorse[...] [...]ant correlation of [...]
based on [...]

[...] Counseling Service subjects consisted of 43 females,
while the replication sample included 55 subjects.
Seventy-two adjectives [...] discriminated between
high and low [...] (r = .66) between
female Def scale scores and number of adjectives
checked was found for a sample of 114 college females.
Correction was made based on the following
statistics: mean number of adjectives checked
= 92.63; SD of total adjectives checked = 30.75; and
SD of Def scale = 6.97. It was found that
adding or subtracting two Def scale points (about
one-third SD) for each 10 adjectives checked (about
one-third SD) above or below 93 adjectives (with
partial credits for parts of 10) overcorrected and
produced a negative correlation of -.64 between
Def scores and total adjectives. The same correction
used with the male scale was then applied and quite
successfully eliminated the relationship [...]
between Def scores and total adjectives checked,
based on the performances of the 103 additional
female college subjects. The 10 week test-retest
reliability of the Def scale for females (N = 56) is [...].
Some preliminary evidence was available to evaluate
the male and female Def scales as measures of
defensiveness. These scales were scored on ACLs
administered to normal college subjects under three
conditions: (a) a standard instruction research
condition; (b) a standard instruction defensiveness-inducing
condition (Heilbrun, 1958); and (c) an
ideal-self-description instruction condition (Heilbrun,
1958). If Def scales are measures of defensiveness,
scores might be expected to show a progressive
increase over these three conditions. Table 1 shows this
expected progressive mean increase as well as an
increasing homogeneity of scores for Def scores. Tests
of significance showed the difference in standard and
defensive condition means for males (t = [...] for
123 df; p < .01) and females (t = 3.80 for [...] df;
p < .001) to be highly significant. Various considerations
(e.g., overlapping subjects, heterogeneous variances)
made testing for mean differences between
defensive and ideal-self conditions unfeasible. In order
to evaluate whether the Def scales may be reflecting
differences in adjustment level rather than defensiveness,
the Def scale means for groups of more poorly
adjusted Counseling Service males (15.26; SD = [...];
N = 109) and females (16.25; SD = 6.08; N = 103)
were compared to the scale means of the normal
(“standard condition”) male (15.30) and female (16.85)
groups reported in Table 1. The [...]
indicate that Def scores are not
measures of adjustment level.

Despite the preliminary evidence that the Def
scales do measure defensiveness in self-description for
normal college population subjects, it is clear that
scores on these scales still confound defensiveness
and true adjustment level (i.e., some high scorers
are truly well adjusted and some low scorers are
truly maladjusted). The most that is contended is
that proportionally more of the performance variance
on the Def scales can be attributed to defensiveness
than would be the case for performance on
the entire ACL, the proportions in either case being
unspecifiable.

To test Smith’s hypothesis that the K scale is not
a measure of defensiveness in a normal population,
two correlational analyses were conducted: (a) the 
K scores of 103 Counseling Service female and 109 
Counseling Service male college students were cor- 
related with their Def Scores on the ACL, and (b) 
the K scores of 141 normal college female and 92 
normal college male subjects were correlated with 
their ACL Def scores. ACLs for all subjects in the 
Counseling Service groups were administered as part 
of the Counseling Service intake battery at a vari- 
able time (from a few moments to almost 2 years) 
following the administration of the MMPI. The normal
subjects were given the ACL under research
conditions from 1 to 18 months after taking the
MMPI.

RESULTS AND DISCUSSION 


Hypothesis I: The K scale is a measure of 
psychological health in a normal population 


The mean K standard score for the male
PA group was 55.62 (SD = 8.55) compared
to a mean K standard score of 54.18 (SD
= 8.35) for the BA male subjects. These
mean values do not differ significantly from
each other (t = 1.43 for 299 df; .15 < p
< .20). The female PA group had a mean K
score of 56.83 (SD = 7.25) whereas female
BA subjects had a mean K score of 58.02
(SD = 7.02). Again the difference in means
did not differ reliably from zero (t = 1.51 for
338 df; .10 < p < .15). Thus, there is no support
in these data for the contention that the
K scale measures degree of psychological adjustment
in a normal population. However,
since each of the PA groups was composed of
two subsets of subjects differing in level of
adjustment (i.e., better adjusted vocational-educational
cases vs. more poorly adjusted
personal adjustment cases), it remained a
possibility that differences in K might be
demonstrated if more extreme comparisons
were made. Accordingly, the mean K scores
for the personal adjustment counseling subjects
and the BA subjects were compared. For
males, the mean K score for the personal adjustment
subjects (N = 46) was 54.24 (SD
= 8.62), whereas this mean score for the BA
subjects (N = 153) was an almost identical
54.18 (SD = 8.35). For females, the mean K
score for the personal adjustment subjects (N
= 43) was 55.23 (SD = 6.35) and the K
scale mean for the BA subjects (N = 197)
was 58.02 (SD = 7.02). This difference was
significant at the .05 level of confidence (t
= 2.36 for 238 df). Thus, there is some evidence
that K is positively related to level of
psychological adjustment for females when
extreme groups are compared but no evidence
for this relationship with males.
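The female comparisons above can be approximately reproduced from the reported means, SDs, and Ns. The sketch below assumes a pooled-variance Student's t for independent groups; the text does not state which formula was used, so this is an assumption, and `pooled_t` is a hypothetical helper name. Under that assumption the degrees of freedom match the reported 338 and 238, and the t values come out near 1.52 and 2.40, close to the reported 1.51 and 2.36 (the small gaps may reflect rounding of the published summary statistics).

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic and df from group summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Female BA (N = 197) vs. the full female PA group (N = 143).
t_all, df_all = pooled_t(58.02, 7.02, 197, 56.83, 7.25, 143)

# Female BA (N = 197) vs. personal adjustment cases only (N = 43).
t_extreme, df_extreme = pooled_t(58.02, 7.02, 197, 55.23, 6.35, 43)

print(f"BA vs. PA:                  t = {t_all:.2f}, df = {df_all}")
print(f"BA vs. personal adjustment: t = {t_extreme:.2f}, df = {df_extreme}")
```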

Since all PA subjects were Counseling Serv- 
ice clients who had solicited help, one pos- 
sible bias in these K by adjustment level 
analyses is that such “help-seekers” would 
also tend to be uncritically frank endorsers of 
pathology (i.e., “plus getters”) on the MMPI. 
Since plus-getting should be associated with 
lowered K scores, such a bias would operate 
in the direction of supporting the hypothesis 
that the more poorly adjusted subjects in this 
study would have lower K scores than nor- 
mal subjects who had not sought psychologi- 
cal help. There does not appear to be any way 
to analyze the possible effect of plus getting 
within the current data, although it can be 
noted that Hypothesis I failed to receive any 
support in the male analysis despite this pos- 
sible bias effect. It might be added that the 
lowered circumspection implicit in plus get- 
ting behavior can actually be considered a 
part of the true pathology of subjects, repre- 
senting as it does a marked lowering of the 
ego defenses. 


Hypothesis II: The K scale is not a measure 
of defensiveness in a normal population 


The product-moment correlation between K
standard scores and scores on the Def scale
was .43 for the 109 male Counseling Service
subjects. Considering the only moderate test-retest
reliabilities of the two scales² and the
fact that considerable time typically elapsed
between the two tests, a correction for attenuation
was applied and this correlation became
.64. Both correlations are significant beyond
the .01 level of confidence. For the 103
Counseling Service female subjects, the correlation
between K and Def was .26. After
correction for attenuation this correlation was
.35, both figures being significant beyond the
.01 level of confidence. The correlation between
K and Def scale scores for normal college
males (N = 92) was .24 (p < .05) or .35
(p < .01) corrected for attenuation. This correlation
was -.25 (p < .01) for normal college
females (N = 141) or -.36 (p < .01)
following correction for attenuation.
These correlations suggest two
implications. First, the K scale appears to be
a better measure of defensiveness for maladjusted
subjects in a normal population than
for adjusted subjects in a normal
population. The decrease in the positive relationship
between the K scale and the measures
of defensiveness, comparing the correlation
for the male maladjusted group to that
for the male adjusted group (.29) and the
correlation for the female maladjusted group
to that for the female adjusted group (.71),
was significant in both cases (p < .05 and .001,
respectively). This finding is generally consistent
with Smith’s (1959) argument that “it is
defensive for abnormal population Ss to obtain
high K scale scores but a sign of health
for normal population Ss” (p. 276), if the
reasoning is extended to maladjusted vs. adjusted
subjects within a normal population. The
second implication of these correlational data
is that a sex difference in the psychological
meaning of K scale performance may exist.
In the analyses of maladjusted normal groups,
the females provided a significantly positive
correlation between K scale performance and
a measure of defensiveness but one that was
reliably lower (p < .05) than that for the
males. When adjusted normal groups were
analyzed, males continued to show a significant
positive relationship between their K
scale scores and Def scale scores, whereas females
showed the opposite pattern: the higher
the K scale score tended to be, the lower the
defensiveness. The reversal for adjusted college
females of the usual psychological significance
attributed to the K scale was also
found by Smith in his predominantly male
group of industrial supervisors. This reversal,
taken in conjunction with the finding that
adjusted females showed significantly higher
K scores than more seriously maladjusted females,
suggests that both of Smith’s hypotheses
received considerably more support from
the female data than from the male.

2 A test-retest reliability figure of .70 was used for
the K scale in all corrections for attenuation. This
figure has been suggested as a best estimate (Dahlstrom
& Welsh, 1960, p. 53).
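The correction for attenuation applied to the K-Def correlations above follows the classical formula, in which the observed correlation is divided by the square root of the product of the two reliabilities. The sketch below is illustrative only. The K reliability of .70 comes from the footnote; the male Def scale reliability is not given in this section, so `DEF_RELIABILITY_ASSUMED` is a hypothetical placeholder chosen so that the example lands near the reported corrected value of .64.

```python
import math


def correct_for_attenuation(r_xy, reliability_x, reliability_y):
    """Classical disattenuation: r_xy / sqrt(r_xx * r_yy)."""
    return r_xy / math.sqrt(reliability_x * reliability_y)


K_RELIABILITY = 0.70               # stated in the footnote
DEF_RELIABILITY_ASSUMED = 0.645    # hypothetical placeholder, not from the source

# Observed K-Def correlation for the 109 male Counseling Service subjects.
corrected_male = correct_for_attenuation(0.43, K_RELIABILITY,
                                         DEF_RELIABILITY_ASSUMED)
print(f"corrected r = {corrected_male:.2f}")
```

Because both reliabilities are below 1.00, the corrected coefficient is always larger in absolute value than the observed one, which is why the negative correlation for normal females moves from -.25 to -.36.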

In conclusion, the data from the present
study indicate that the K scale is positively
related to defensiveness when more maladjusted
subjects from a normal college population
are appraised but is less positively related
or even negatively related to defensiveness
when psychologically healthy subjects
are considered. The alternative psychological
implication of K for normal population subjects
suggested by Smith (degree of psychological
health) received some support in the
case of females but none in the case of males.
These results suggest that the standard K correction
for the MMPI clinical scales is psychometrically
advantageous in normal college
population testing with male maladjusted subjects,
somewhat less useful with maladjusted
females and better adjusted males, and a
source of invalidity with better adjusted females.


SUMMARY 


Two hypotheses taken directly from the
study by Smith (1959) and indirectly from
earlier investigators were tested in the present
study: (a) the K scale of the MMPI is
a measure of psychological health in a normal
population, and (b) the K scale is not a
measure of defensiveness in a normal population.
To test Hypothesis I, the K scores of
two samples of maladjusted male (N = [...])
and female (N = 143) Counseling Service
clients were compared with the K scores of
male (N = 153) and female (N = 197) college
normals. No significant differences were
found in either comparison, although the normal
female group mean K score was reliably
higher than that of a subgroup of the
most seriously maladjusted females (N = 43).
Thus, there was some support for the hypothesis
that K is a measure of psychological
health in a normal population in the case of
females only.

Hypothesis II was tested by correlating K
scale scores with specially constructed male
and female Defensiveness scales for the Adjective
Check List. Significant correlations of
.64 for male Counseling Service subjects (N
= 109) and .35 for female Counseling Service
subjects (N = 103) support the assumption
that the K scale is a measure of defensiveness
for maladjusted subjects in a normal population.
However, when these relationships were
determined for normal male (N = 92) and
female (N = 141) college subjects, reliably
different correlations between K and the defensiveness
measures were obtained (.35 and
-.36, respectively). These correlational data
tended to support Smith’s contention that the
K scale is a better measure of defensiveness
among more maladjusted subjects.
SELF-SATISFACTION AND PSYCHOLOGICAL ADJUSTMENT 
IN SCHIZOPHRENICS 


DENNIS K. KAMANO1 
Galesburg State Research Hospital, Illinois 


It is generally recognized that the satisfac- 
tion or concern of an individual with his phe- 
nomenal self represents an important aspect 
of psychological adjustment. For example, it 
has been demonstrated that marked dissatis- 
faction with one’s self is indicative of con- 
flicts or maladjustment (Cowen, Heiliger, & 
Axelrod, 1955), while positive and self-accept- 
ing attitudes towards the self are associated 
with good psychological adjustment (Mc- 
Quitty, 1950; Rogers, 1950). Most of these 
studies, however, were confined primarily to 
normal and psychoneurotic subjects, but the 
relationship between self-satisfaction and psy- 
chological adjustment in regard to other 
classes of individuals is not clear. It follows
that a particular formulation found to be applicable
to one class of subjects may not be
[...] conception [...] view that [...]
hospitalized patient (Arieti,
1955). Similarly, it is granted that to admit
satisfaction with one’s self is indicative of
good adjustment in a normal individual, but
is it a prognostically good sign when seen in
hospitalized schizophrenic patients? The relationship
between self-satisfaction and psychological
adjustment is a complex one, and there
is a need for further study and a qualified
interpretation.


It has been widely recognized by clinicians
that schizophrenic patients differ markedly in
their degree of expressed self-satisfaction and
adaptive potential. In contrast to normal subjects,
it has been widely recognized by clinicians
that with hospitalized schizophrenic
patients at least, concern with one’s self represents
a more adaptive behavior than self-satisfaction
when this is based upon suppressive
and repressive mechanisms. Some
patients reveal extreme self-satisfaction, a
frequently observed behavior used by patients
to deny to themselves the extent of
their discontent and pathology. Such patients
are likely to be unrealistic, inflexible, and
resistant towards any forces that tend to
disrupt such rigid self-definition to such an
extent that adaptive potential is grossly reduced.
Much depends, of course, on the concept
of adjustment one subscribes to, but most
psychologists would agree in considering the
suppressive, repressive mode of adaptation in
hospitalized schizophrenics as less than adequate.
Such behavior may represent an adaptation
sufficient for a stable and benign
hospital environment where pressure on the
patient never becomes too great, but one
which is incapable of manifesting
flexibility in other situations. In a sense, such
a person is adapted, as far as hospital adjustment
is concerned, but not adaptable.

1 The author wishes to express his appreciation to
Janet E. Drew and Vasso Vassiliou for their assistance
in the collection of the data.

The above considerations led to the formulation
of the following hypotheses with which
the present study is principally concerned:

1. Schizophrenic subjects revealing extreme
self-satisfaction will tend to deny and suppress
threatening features of themselves to
such an extent that this will be reflected in
their response to a personality evaluation.
That is, schizophrenic subjects revealing extreme
self-satisfaction will reveal lower recall
of unfavorable personality characteristics
from a passage designed to simulate a personality
evaluation, as compared with schizophrenic
subjects not so characterized. A corollary
of this proposition is that extremely
self-satisfied schizophrenic subjects
will tend to reveal higher recall of favorable
items consistent with their highly favorable
self-concept.

2. Schizophrenic subjects revealing extreme
self-satisfaction based upon denial and repressive
mechanisms represent a state of unrealistic
self-appraisal and general reduction
in their capacity for evaluation, and such reduction
in evaluative ability will be reflected
in a situation requiring realistic appraisal of
their performance from an objective frame of
reference. That is, schizophrenic subjects revealing
extreme self-satisfaction will reveal
greater discrepancy between their level of
performance and level of aspiration than
schizophrenic subjects not so characterized.
Schizophrenic subjects admitting some dissatisfaction
with themselves will tend to set
their level of aspiration more in relation to
their actual level of performance and reveal
lower discrepancy scores than extremely self-satisfied
schizophrenic subjects.


METHOD 


Measure of Self-Satisfaction 

There are several ways in which self-regarding
attitudes can be measured. One method is to use
the semantic differential technique and to index the
evaluation of the self-concept along the scale provided
by the subject’s own judgments of the concepts
my Actual Self (AS), my Ideal-Self (IS),
and my Least-Liked Self (LLS), e.g., the distance
from AS to LLS as a ratio to the total distance
from LLS to IS (Osgood, Suci, & Tannenbaum, 1957).
This ratio, LLS-AS/LLS-IS, was used in this study
as an index of self-satisfaction. The ratio approaches
1.00 as the location of AS approaches
that of IS, i.e., as one’s self-satisfaction increases.
In other words, the value increases in size
with self-satisfaction.

These self-concepts were rated on 15 bipolar scales
which were presumed to be relevant. The scales used
included 6 representative of the evaluative factor
(attracting-repelling, complete-incomplete, important-unimportant,
healthy-sick, high-low, and sociable-unsociable),
5 for the potency factor (large-small,
hard-soft, strong-weak, deep-shallow, and masculine-feminine),
and 4 for the activity factor (active-passive,
hot-cold, tense-relaxed, and aggressive-defensive).


Subjects 


Forty-four institutionalized white women diagnosed
as schizophrenics were subjects in the present study.
The subjects were screened to determine that they
were sufficiently in contact and of adequate operating
intelligence to understand and complete the
tasks involved. Self-satisfaction scores for the 44
subjects were secured from the ratio, LLS-AS/LLS- 
IS, by averaging the ratio for each of the 15 scales. 
There were 17 subjects who received a score of 1.00 
or more, while 27 subjects received a score of less 
than 1.00. The high self-satisfaction (HS) group was 
composed of the 17 subjects who received a score of 
1.00 or more, while the low self-satisfaction (LS) 
group was composed of the 27 subjects who received 
a score of less than 1.00. The mean age for the HS 
group was 30.40 with a range of 19-38, while the 
mean age for the LS group was 27.30 with a range 
of 18-38. Both groups were matched on their im- 
mediate recall of a control passage, and the mean 
score for the HS group was 5.22 and for the LS 
group 5.12, a nonsignificant difference. Both groups 
were composed of chronic undifferentiated schizo- 
phrenics with only two chronic paranoid schizo- 
phrenics in the HS group and eight in the LS group. 
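The self-satisfaction index and the grouping rule described above can be sketched in Python. This is a minimal illustration, not the author's scoring program: the function names are hypothetical, distances are taken as absolute differences between scale ratings, and a scale on which IS and LLS received the same rating is skipped to avoid division by zero (an assumption, since the text does not say how such cases were handled).

```python
def self_satisfaction(actual, ideal, least_liked):
    """Mean over scales of distance(LLS, AS) / distance(LLS, IS)."""
    ratios = []
    for a, i, l in zip(actual, ideal, least_liked):
        total_distance = abs(i - l)
        if total_distance == 0:
            continue  # assumption: scales with IS == LLS are left out
        ratios.append(abs(a - l) / total_distance)
    if not ratios:
        raise ValueError("no scale has a nonzero LLS-to-IS distance")
    return sum(ratios) / len(ratios)


def satisfaction_group(score):
    """HS for a score of 1.00 or more, LS otherwise, as in the text."""
    return "HS" if score >= 1.00 else "LS"


# Hypothetical ratings on two 7-point scales (the study used 15 scales).
score = self_satisfaction(actual=[6, 5], ideal=[7, 7], least_liked=[1, 1])
print(score, satisfaction_group(score))
```

A subject whose Actual Self ratings coincide with her Ideal-Self ratings on every scale receives exactly 1.00 and therefore falls in the HS group.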


Procedure 


It is relevant to this experiment to note that two
female assistants served in the various phases of the
study. The recall phase of the experiment was conducted
by one assistant, while the level of aspiration
phase was conducted by the other assistant. Two
different assistants were used in an effort to maintain
some degree of independence between the different
phases of the experiment proper.

Each subject was examined individually. After a 
brief general discussion, the subject was presented a 
control passage secured from the Wechsler Memory 
Scale, Form 1 (1945, p. 6), to match the subjects on 
their immediate recall. Following the presentation of 
the passage, each subject rated the concepts AS, IS, 
and LLS on the semantic scale presented in counter- 
balanced order. Two matched groups of 17 HS sub- 
jects and 27 LS subjects, respectively, were secured 
for the experiment proper. 

Recall series. This session was conducted 2-5 days 
after the initial phase of the study. For all subjects 
the following instruction was given: 


Remember the ratings of yourself that you com- 
pleted the last time? Well, as you know, they do 
reveal a lot of things about you. I have here an 
evaluation on what was revealed about you from 
the tests that you took. Of course, this will be 
strictly confidential. Listen very carefully while I 
read it to you because I want you to tell me as 
much about it as you can. 


Following the reading of the passage, the subject 
was instructed to repeat as much of it as she could 
and was assured that it did not have to be in the 
exact words. 

The experimenter read aloud the passages printed 
on a card. The experimenter practiced the reading 
prior to the experiment and found little difficulty in 
preserving uniformity from reading to reading. 

The experimental passage. The experimental passage
reproduced below has been subdivided into
items for scoring purposes. The passage was divided
so that each item would include one idea, but so that
the recall of one item would not automatically include
the recall of another.


You are an intelligent person/but you are un[...]./
You are satisfied to
do only enough to get by,/although you have the
ability to do more./You do not always see things
clearly,/even though you are capable of handling
situations normally./You have good general knowledge,/and
can assume responsibility./However, because
you feel insecure,/you are afraid to try new
things./You are able to get along with people,/
but you are too easily offended./You could be self-sufficient,/but
you prefer to be dependent upon
others./


In order to note differences in the recall of types
of items, each item was rated as “favorable” or “unfavorable.”
The ratings were made by the author and
two other independent raters. The percentage agreement
between the three independent raters was
84.30%. In the case of discrepancies, the final rating
was decided upon after joint conference.

There were a total of 14 items, 7 favorable and 7
unfavorable. [...] were not scored.


Level of aspiration series. [...] recall [...]

“You made a score of [...] in 1 minute.
Let’s try it again. Here is another test which [...]
in the same way, but [...]

An equivalent form of the Wechsler-Bellevue Digit
Symbol test was used. After the subject [...]
the samples, the experimenter said: “How many of
these do you think you will be able to do in 1
minute?” After recording the estimate, the subject
was allowed to perform on the test.

Scoring. The deviation of estimate from performance
was designated as the “D score.” In each case
the D score was the difference between the performance
or actual score made and the estimate following
it. When a subject estimated higher than the
score she had earned, her D score was designated as
positive. Whenever a subject estimated lower than
the previous performance, her D score was negative.
The D score provides not only a measure of the
subject’s aspiration but also of her adjustment to
the reality of her own performance. A low D score
implies somewhat better contact between goal and
accomplishment.

A second D score utilized was the difference between
the estimate just made and the performance
following it. This measure reflects the level of success
and failure in relation to the subject’s goal
setting.

Because of apparent skewness in the D scores, the
results were analyzed by the nonparametric Mann-Whitney
U test.


RESULTS AND DISCUSSION
Recall Series


In Table 1 the HS and LS groups are compared
on the number of items recalled from
the passage simulating a personality evaluation.
Our data indicate that the HS group
recalled significantly fewer items than the LS
group, both in total recall and in the recall
of items reflecting unfavorable personality
characteristics. There was no significant difference
in the recall of favorable items between
the two groups. However, for the HS
group alone, more favorable items were recalled
than unfavorable items. It appears
that, as predicted, the schizophrenic subjects
with extremely high self-regarding attitudes
tended to deny and suppress unfavorable personality
features of themselves to such an extent
that this was reflected in their performance.
Since schizophrenic subjects with extreme
self-satisfaction tend to resist self-evaluation
or any external promptings that
may disrupt the self-definition they have
adopted, the threat of the test situation undoubtedly
contributed not only to the reduction
in recall of unfavorable items but in the
recall of favorable items as well. It is possible
that, once developed, such self-definitions resist
change and represent a prognostically
poor sign for therapy. The question of therapy
with such subjects invites further study.

TABLE 1

COMPARISON OF MEAN RECALL SCORES

                  HS recalls      LS recalls
Type of item      Mean    SD      Mean    SD      t

Favorable         1.35   [...]
Unfavorable        .94   [...]    1.13    1.31
Total             2.29    1.45   [...]

* Significant at the .05 level, one-tailed test.


Level of Aspiration Series 


To begin with, the HS and LS groups were
compared on their initial performance on the
Digit Symbol test. The HS group obtained a
mean score of 25.30 and the LS group a mean
score of 26.00, a nonsignificant difference. In
this respect, the two groups were matched in
their performance on the Digit Symbol test,
and this lessened the possibility of the effects
of differential ability on the results of this
phase of the study.

The discrepancy between performance and
estimate (D score), with signs disregarded
(i.e., positive or negative direction of estimates),
provided a measure of each subject’s
adjustment to the reality of her own performance.
The results of this analysis are given in
Table 2. D scores secured from the discrepancies
between Trial 1 and estimate revealed
significantly higher D scores for the HS group
as compared with the LS group. There were
significantly greater discrepancies between
the actual scores made on the first trial and
the estimates following it for the HS group
as compared with the LS group.

TABLE 2

MEAN DISCREPANCIES AND RESULTS OF APPLYING THE
MANN-WHITNEY U TEST TO ARRAYS OF DISCREPANCY
SCORES BETWEEN THE HS AND LS GROUPS

                         HS group   LS group
D score                  Mean       Mean       z score

Trial 1 and estimate     9.71       3.60       2.61**
Estimate and Trial 2     9.71       4.96       2.37*
Trial 1 and Trial 2      3.06       3.67        .85

* Significant at the .05 level.
** Significant at the .01 level.

D scores secured from the discrepancies 
between the estimates and subsequent per- 
formances (Trial 2) also revealed significantly 
higher D scores for the HS group as com- 
pared with the LS group. The discrepancies 
between the estimates made and the perform- 
ances following it were significantly greater 
for the HS group than for the LS group. 
Analysis of the differences between Trial 1 
and Trial 2 on the Digit Symbol test yielded 
no significant differences between the two 
groups. 

Table 3 presents an indication of the per- 
centages of subjects in the HS and LS groups 
in terms of the direction of their discrepancy 
scores. The D score between Trial 1 and esti- 
mate was designated as positive if the esti- 
mate following Trial 1 was higher, negative 
if lower, and zero if there was no change. 
The D score between estimate and Trial 2 was designated positive
if Trial 2 following the estimate was higher, negative if lower,
and zero if there was no change. Similarly, the D score was
designated as positive if Trial 2 was higher than Trial 1, etc.
Analyses of the percentages of directional differences by the chi
square method revealed no significant differences between the two
groups in the three conditions. That is, there were no
significant differences between the HS and LS groups in the
percentages of subjects showing positive or negative D scores.

TABLE 3

PERCENTAGE OF DIFFERENCES IN DIRECTION OF DISCREPANCY SCORES

D score                 Direction   HS group   LS group   Chi square   p
Trial 1 and estimate    Positive    35.29      33.33      5.03         .10
                        Negative    ...        ...
                        Zero        ...        ...
Estimate and Trial 2    Positive    64.71      62.96      3.19         .25
                        Negative    ...        ...
                        Zero        ...        ...
Trial 1 and Trial 2     Positive    47.06      74.07      4.52         .20
                        Negative    ...        ...
                        Zero        ...        ...
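As a rough illustration of the scoring just described, the signed discrepancies, their directions, and their unsigned magnitudes for a single subject might be computed as follows. The scores and the function names are hypothetical and serve only to make the definitions concrete.

```python
def d_score(earlier, later):
    """Signed discrepancy: positive when the later value is higher."""
    return later - earlier


def direction(d):
    """Classify a signed discrepancy as positive, negative, or zero."""
    if d > 0:
        return "positive"
    if d < 0:
        return "negative"
    return "zero"


# Hypothetical Digit Symbol scores for one subject (not data from the study).
trial_1, estimate, trial_2 = 25, 34, 27

signed = {
    "Trial 1 and estimate": d_score(trial_1, estimate),
    "Estimate and Trial 2": d_score(estimate, trial_2),
    "Trial 1 and Trial 2": d_score(trial_1, trial_2),
}

# Signs are disregarded for the magnitude analysis (Table 2) and kept
# for the directional analysis (Table 3).
magnitudes = {name: abs(d) for name, d in signed.items()}
directions = {name: direction(d) for name, d in signed.items()}
```

In this sketch the same three comparisons feed both analyses: the unsigned magnitudes would be compared between groups, and the directions would be tallied into percentages.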
The results of this phase of the study indicate that the
differences between the HS and LS groups resulted not from the
actual per- ... the D score, ... or aspiration ... gnificant dif-
... subjects re- ... g their estimate ... than the HS group. In a
s ... was capable of manifesting adaptive flexibility in such
situations to a ... the HS group.
Further comments appear indicated in regard to the composition of
the HS group of this study, since it would seem anomalous for
someone to evaluate his actual self higher than his ideal-self as
did some of the schizophrenic subjects in the HS group. It would,
indeed, be very unusual for someone to evaluate his actual self
higher than his ideal-self, but such occurrences should be
expected from some hospitalized schizophrenic patients. A
frequently observed phenomenon in the clinical setting is that of
schizophrenic patients revealing extremely high self-regarding
attitudes who deny to themselves and to others the extent of
their pathology and discontent. Such schizophrenic subjects have
an unusually high self-concept which, of course, represents an
unrealistic self-appraisal. It should not be surprising, then,
that some of the schizophrenic subjects comprising the HS group
rated their actual selves higher than their ideal-selves.
However, further study is needed to clarify the significance of
the discrepancies reflected in the self-satisfaction index of
such subjects.

Dennis K. Kamano


SUMMARY 


The present study sought to test two hypotheses: (a)
schizophrenic subjects revealing extreme self-satisfaction tend
to deny or suppress threatening features of themselves to such an
extent that they will recall fewer items reflecting unfavorable
personality characteristics, from a passage designed to simulate
a personality evaluation, than schizophrenic subjects admitting
some dissatisfaction with themselves, and (b) schizophrenic
subjects revealing extreme self-satisfaction will show greater
discrepancy between their level of performance and level of
aspiration than schizophrenic subjects not so characterized. Both
hypotheses were supported when tested on a sample of 44
hospitalized schizophrenic women. Implications were drawn with
regard to psychological adjustment.
One of the key problems in the employment of projective
techniques is the interpretation of the content of a TAT story
with reference to possible behavioral correlates. For example,
assume A and B both manifest an equal amount of hostility in
their protocols to two different series of TAT pictures. Assume
further that the hostility score employed is a sophisticated one
taking cognizance of the inhibitions to aggression in the stories
as well as of the direct expression of hostility. Does the
equality of scores for A and B provide any foundation for the
prediction that the behavioral correlates of their scores should
be the same? Hardly, unless we can ascertain that the hostility
cards in the series for each person are equal in their
hostility-educing properties, and that the overall scores
represent equal deviations from the stimulus properties of each
card for each story. This latter statement is necessary since if
A obtained his hostility score from telling chiefly hostile
stories to nonhostile cards and nonhostile stories to hostile
cards, the interpretative significance might differ from that
attributed to B, who followed the stimulus properties of the
cards closely in giving hostile stories to hostile cards and
nonhostile stories to nonhostile cards. Thus, the implications
for interpretation may differ even though the overall scores are
equivalent. This topic has been dealt with more fully elsewhere
(Murstein, 1961).

* The authors would like to thank Carol Bowdish, Walter Coulter,
and Irvin Hansen for their contribution in the collection of
data.

It is apparent that the relationship of responses on the TAT to
the stimulus qualities of the cards may have important behavioral
correlates which are helpful in the assessment of personality.
Though several studies have been undertaken applying scaling
techniques to thematic cards (Auld, Eron, & Laffal, 1955; Lesser,
1958), none has scaled the TAT in its entirety. Moreover, the
previous studies employed only the Guttman technique. One might
ask whether a series of scaling devices currently employed in
measuring attitudes could be used to determine the stimulus
properties of the entire series of TAT cards. If the stimulus
value of the cards could be ascertained, then the relationship
between the subject's response to the cards and the stimulus
value might provide meaningful inroads into the study of
personality. Our specific question therefore was formulated as
follows: can a scale of hostility be constructed by each of the
following scaling devices: Thurstone Equal Appearing Interval
method (EAI); Successive Categories method (SC); Likert method;
Edwards Scale Discrimination technique; and the Stouffer,
Borgatta, Hays, and Henry H-technique?


PROCEDURE 


Subjects were composed of 100 University of Portland
undergraduate students, obtained from the various sections in
general psychology. There were an equal number of men (50) and
women (50) for the EAI and SC methods. The Likert, Edwards, and
H-technique methods were applied to 42 men and 58 women. The same
subjects participated in each scaling procedure with the
exception that there were eight fewer men and eight additional
women present for the Likert, Guttman, Edwards, and H-technique
methods.

The EAI and SC methods differ primarily in that the former
assumes that equal intervals exist for the various categories
while the latter method does not make this assumption but instead
measures the actual width of the intervals between the cards. The
data for both methods, however, may be obtained in the same
manner. Accordingly, groups of four to six subjects were
instructed to stand before a table upon which were placed nine
sheets of white paper, numbered one through nine, to represent
nine categories of judgment. They were presented with the 31 TAT
cards in random order and given these instructions:


You will be shown a series of 31 pictures which you are to judge
objectively for the amount of hostility shown. By hostility I
mean unfriendliness, anger, the desire to hurt either physically
or mentally. The expression of hostility can vary from barely
noticeable to extremely intense. It may be directed towards
people, animals, objects, or nothing in particular. Your task is
to judge the 31 cards according to the amount of hostility shown
on each card. In front of you are numbers from one to nine which
represent a continuum from the least amount of hostility to the
greatest amount. Thus, the least hostile card would be put in
pile Number 1 while the most hostile card would be put in pile
Number 9.

Pile Number 5 represents the midpoint category separating the
more than average hostile cards from the less than average
hostile cards. For example, a card which seemed more hostile than
average, but not extremely hostile, might be put in a pile higher
than the midpoint but yet not in one of the extreme piles.

Remember you are to judge these cards according to the amount of
hostility they objectively possess, not how you personally feel
about them. Do not forget to judge the blank card. Are there any
questions?


The judgments were tabulated on a data sheet after each subject
had completed the task. The data for the Likert, Edwards, and
H-techniques were obtained in a group situation. With about 30
subjects seated in a classroom on each occasion, subjects were
given the following instructions:

Now I would like you to look over each card individually and tell
me how hostile it seems to be [hostility was defined in the same
way as it was defined in the instructions for the aforementioned
methods]. For each card I show you, check to see that you have
the right number of the card as I call it out and then, looking
at the picture, circle the phrase which best describes how
hostile the picture appears. You have five choices: (1) very
hostile, (2) fairly hostile, (3) undecided, (4) little hostility,
(5) not hostile.


Slides containing facsimiles of the 31 TAT cards were presented
to the three groups of approximately 30 students, one slide at a
time. Each slide was projected on the screen for 45 seconds,
during which time the subject recorded his judgment. To roughly
control for serial position, one group viewed the pictures in
numerical order, the second in reverse numerical order, and the
last by starting with the middle numbered card and successively
presenting the following cards in descending order on either side
of the middle card. Thus, Card 11, the sixteenth card in the
series of 31 cards, was presented first, followed by Card 10
(fifteenth card) and then Card 12M (seventeenth card), etc.
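The third presentation order can be made concrete with a short sketch. It assumes that, after the middle card, positions alternate between the next lower and the next higher position, which matches the example given (sixteenth, fifteenth, seventeenth); the function name is hypothetical.

```python
def center_out_order(n_cards):
    """Return serial positions starting at the middle and working outward.

    Positions are 1-based. After the middle position, the next lower and
    then the next higher position are taken in turn (an assumption that
    matches the sixteenth, fifteenth, seventeenth example in the text).
    """
    middle = (n_cards + 1) // 2
    order = [middle]
    for step in range(1, n_cards):
        for position in (middle - step, middle + step):
            if 1 <= position <= n_cards:
                order.append(position)
    return order


third_group_order = center_out_order(31)
```

For 31 cards the sequence begins with positions 16, 15, 17, 14, 18 and ends with positions 1 and 31.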


RESULTS 


Since there appeared to be little difference between the
judgments of the sexes, the successive categories values for both
sexes were correlated. The r of .93 indicated no reason why the
judgments could not be regarded as coming from the same
population of judgments. Accordingly, the median scale values (S)
for the EAI and SC methods, the mean Likert judgments, and the
EAI interquartile deviations (Q) for each card for the total
group are listed in Table 1 together with the Likert t values.

A test of internal consistency was applied to the values obtained
via SC to determine whether the assumptions for scaling the cards
were supported. These assumptions were: (a) the projection of the
cumulative proportion distributions for the various cards is
normal on the psychological continuum; (b) the psychological
dimension scaled is unidimensional; and (c) the standard
deviations of the discriminal dispersions are equal. In using the
χ² test, a fourth assumption made is that there is zero
correlation among the stimuli, since the proportions used must be
independent of each other (Edwards, 1957; Guilford, 1954). The χ²
formula suggested by Guilford (1954, p. 232) was employed, which,
in view of the large number of degrees of freedom, 210, was
transformed into an approximate t ratio. The t value was 24.20,
which was significant beyond the .001 level. It is apparent that
one or more of the multiple assumptions made in scaling the cards
was unjustified. To determine how important this finding was, we


TABLE 1
EQUAL APPEARING INTERVAL, SUCCESSIVE CATEGORY, Q, AND t VALUES FOR EACH TAT CARD

        EAI      SC       EAI      Likert     t, highest vs.
        scale    scale    Q        mean       lowest quartile
Card    value    value    value    judgment   (Likert)           p
12BG    1.13      .130    ...      ...        2.00               .05
16      1.19      .190    1.07     1.20        .55               ns
10      1.27      .262    ...      ...        4.67               .01
8GF     1.41      .399    ...      ...        1.94               .05
9BM     1.67      .580    ...      ...        2.45               .01
13G     2.31      .609    ...      ...        2.89               .05
14      2.80      .629    ...      ...        2.38               .05
17BM    2.89      .659    ...      ...        3.17               .01
2       3.00      .696    ...      ...        3.55               .01
13B     3.11      .734    ...      ...        5.45               .01
1       3.60      .896    ...      ...        2.23               .05
7GF     3.78      .947    ...      ...        2.54               .01
5       4.43     1.152    ...      ...        1.86               .05
7BM     4.56     1.192    ...      ...        2.00               .05
19      4.62     1.216    ...      ...        2.77               .01
6GF     4.76     1.263    ...      ...        4.53               .01
17GF    4.77     1.266    ...      ...        2.75               .01
12F     5.23     1.424    ...      ...        3.74               .01
6BM     5.50     1.516    ...      ...        4.13               .01
20      5.56     1.538    ...      ...        2.36               .05
4       5.88     1.645    ...      ...        2.25               .05
8BM     5.90     1.650    ...      ...        3.78               .01
9GF     5.98     1.676    ...      ...        2.71               .01
12M     6.11     1.763    ...      ...        2.11               .05
3BM     6.70     1.924    ...      ...        3.51               .01
11      6.79     1.968    ...      ...        1.53               ns
3GF     7.03     2.039    ...      ...        4.91               .01
18BM    7.82     2.436    ...      ...        2.90               .01
15      8.12     2.599    ...      ...        1.81               .05
13MF    8.15     2.616    ...      ...        3.75               .01
18GF    8.32     2.712    ...      ...        1.85               .05

obtained the size of the discrepancy between the theoretical and
empirical proportions of judgments for each of the cards in the
various categories. The theoretical proportions were obtained by
taking the scale value of each of the 31 cards and subtracting it
from each of the cumulative interval widths (Edwards, 1957). This
yielded a 31 × 8 matrix of theoretical deviates with the columns
representing the cumulative interval widths and the rows the
various cards. By reference to the table of the normal curve
these values were transformed into theoretical cumulative
proportions. Each of these proportions, based only on the
knowledge of the interval widths and the scale values of the
cards, was compared with its empirical counterpart. The average
value for the discrepancy between the 248 theoretical and
empirical proportions (31 cards × 8 categories) was .038. This
value is not exceedingly large, and indicates that the degree of
lack of internal consistency in the scaling was not great,
although the confidence in the significance of the disparity is
very high.
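The internal consistency check just described can be sketched in a few lines. This is an illustration only: it assumes unit discriminal dispersions and takes the "average discrepancy" to be the mean absolute difference; the function names and the small numeric inputs are hypothetical and are not the study's data.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Cumulative proportion of the unit normal curve below deviate z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def theoretical_proportions(scale_values, boundaries):
    """One row per card, one column per cumulative interval width.

    Each theoretical deviate is the boundary minus the card's scale value.
    """
    return [[normal_cdf(b - s) for b in boundaries] for s in scale_values]


def mean_abs_discrepancy(theoretical, empirical):
    """Average absolute difference over all card-by-category cells."""
    diffs = [
        abs(t - e)
        for t_row, e_row in zip(theoretical, empirical)
        for t, e in zip(t_row, e_row)
    ]
    return sum(diffs) / len(diffs)


# Hypothetical inputs: 2 cards and 3 boundaries instead of 31 and 8.
scale_values = [0.5, 1.5]
boundaries = [0.0, 1.0, 2.0]
empirical = [[0.30, 0.70, 0.95], [0.05, 0.33, 0.68]]

theoretical = theoretical_proportions(scale_values, boundaries)
average_discrepancy = mean_abs_discrepancy(theoretical, empirical)
```

With 31 cards and 8 boundaries the same functions would produce the 248 cell comparisons mentioned in the text.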

TABLE 2

EQUAL APPEARING INTERVAL SCALE VALUES, Q VALUES,
AND t VALUES FOR NINE CARDS SELECTED FOR TEST
OF UNIDIMENSIONALITY

        Scale
Card    value    Q       t
10      1.27     1.14    4.67
13G     2.31     1.35    2.89
13B     3.11     1.39    5.45
7GF     3.78     1.66    2.54
6GF     4.76     1.52    4.53
9GF     5.98      .69    2.71
3GF     7.03     1.39    4.91
18BM    7.82     1.06    2.90
13MF    8.15     1.17    3.75

The Likert values were obtained by taking those individuals whose
overall hostility score placed them in the top quartile and
comparing their scores for each card with those persons whose
score placed them in the lowest quartile. A t value was then
obtained between the groups with regard to their scores on each
card. Table 1 indicates that 11 of the 31 cards proved to be
significantly differentiated at the .05 point, 18 at the .01
point, and only 2 proved to be not significant at all.
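The quartile comparison can be illustrated with a brief sketch. The pooled-variance t ratio used here is an assumption, since the text does not state which t formula was applied; the function names and the ratings are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def quartile_groups(total_scores):
    """Return (high, low) lists of subject indices for the top and
    bottom quarter of subjects ranked by total hostility score."""
    ranked = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    k = len(ranked) // 4
    return ranked[-k:], ranked[:k]


def t_between(high_scores, low_scores):
    """Pooled-variance t ratio between two independent groups.

    Assumes each group has at least two scores and that the pooled
    variance is not zero.
    """
    n1, n2 = len(high_scores), len(low_scores)
    pooled = ((n1 - 1) * variance(high_scores)
              + (n2 - 1) * variance(low_scores)) / (n1 + n2 - 2)
    return (mean(high_scores) - mean(low_scores)) / sqrt(
        pooled * (1 / n1 + 1 / n2))


# Hypothetical ratings of one card by the high and low quartile groups.
t_value = t_between([4, 5, 5, 4], [1, 2, 2, 1])
```

A card with a large t ratio separates high from low hostility perceivers; a card with a ratio near zero does not.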


The Edwards Scale Discrimination Technique was utilized as
follows:

1. The 15 cards above the median Q value were discarded.

2. The cards whic... tive of the entir... from the EAI ...

3. Those ca...

TABLE 3

RESPONSES TO CARD 13MF AS RELATED TO
THE OVERALL SCORE FOR THE NINE CARDS

                  Card 13MF
Total score     0      1      2
17-18           0      0      4
15-16           0      1     10
14              1      0     14
13              0      1     10
12              1      1     12
11              0      3      8
10              1      0     15
8-9             2      1      6
1-7             6      0      3
Sum            11      7     82

The result of this analysis was the selection of nine cards which
were to be tested for unidimensionality via the coefficient of
reproducibility. These cards, from low to high stimulus value of
hostility, were 10, 13G, 13B, 7GF, 6GF, 9GF, 3GF, 18BM, and 13MF.
The EAI, S, Q, and t values for each of the cards are shown in
Table 2.

TABLE 4

CUTTING POINT SELECTED FOR CARD 13MF
WHICH MEETS CRITERIA FOR SELECTION
VIA H-TECHNIQUE

Score for judgments     Nonhostile         Hostile
of all nine cards       response (0, 1)    response (2)
≥10                     9a                 73
<10                     9                  9a

Note. a represents error cells. Card 13MF with cutting point of
0, 1 vs. 2 fulfills the following conditions: (a) neither error
cell is greater than the smaller of the nonerror frequencies, and
(b) the total error percentage is less than 30.

The Stouffer, Borgatta, Hays, and Henry H-technique (1952) was
used to determine the coefficient of reproducibility. The Likert
responses (very hostile, fairly hostile, undecided, little
hostility, not hostile) were collapsed into three judgments:
hostile, undecided, and not hostile, which received weights of 2,
1, and 0, respectively. The distribution of total scores was then
obtained for each of the nine cards, using the weights assigned
to the three response categories. Table 3 indicates, by way of
example, the relationship between the total response score on the
nine cards and the score obtained for Card 13MF. Upon inspection
of this table the best cutting points for the two possibilities,
0, 1 vs. 2 and 0 vs. 1, 2, were selected, bearing in mind two
criteria. These were (a) neither error cell had a higher
frequency than the smaller of the two frequencies on the
principal diagonal (nonerror), and (b) the sum of the frequencies
in the two error cells was less than 30% of the total frequency.
An example of one of these splits for Card 13MF is shown in Table
4. By using more than one split with each card the nine original
cards were converted into four "contrived" cards. Each contrived
card contained a "triplet" of cards


TABLE 5
TAT CARDS AND CUTTING POINTS USED IN THE CONSTRUCTION OF CONTRIVED CARDS
(N = 100)

         Response weight         Frequency of
Card     Negative    Positive    hostile judgments    Contrived card
18BM     0           1,2         93                   -
18BM     0,1         2           91                   I
13MF     0           1,2         89                   -b
13MF     0,1         2           82                   I
3GF      0           1,2         81                   I
9GF      0           1,2         77                   -e
3GF      0,1         2           75                   II
6GF      0           1,2         74                   II
9GF      0,1         2           70                   II
6GF      0,1         2           67                   -
7GF      0           1,2         55                   -cd
7GF      0,1         2           35                   III
13B      0           1,2         35                   III
13G      0           1,2         24                   III
13B      0,1         2           22                   IVe
10       0           1,2         15                   -be
10       0,1         2            9                   IVe
13G      0,1         2            8                   IVe

a Card with this cutting point is closer to the end of the scale than is desirable.
b Card not used because it appears with another cut in same contrived card.
c Card with this cutting point has error cell with greater frequency than the smaller of the two frequencies on the principal diagonal.
d Sum of both error cells greater than 30% of responses.
with prescribed cutting points which indicate which judgment or
judgments are to be considered as a "hostile" choice, and which
"nonhostile." Since there are two possible adjacent cutting
points, (0) vs. (1, 2) and (0, 1) vs. (2), each card could be
used more than once if desired, using a different cutting point
in each case. In order for a contrived card to be judged hostile,
the responses to two or more of the members of the triplet must
be judged hostile. With the number of possible scale types
limited to five, the resulting coefficient of reproducibility was
.965. This value is slightly inflated due to the fact that in
choosing our cutting point we have taken advantage of favorable
sampling errors. Nevertheless, the coefficient is sufficiently
high to conclude that the responses to the cards can be
reproduced with a satisfactorily high degree of accuracy from a
knowledge of the total scores alone.

TABLE 6
... ASSIGNED TO EACH SCALE AND
NONSCALE TYPE VIA H-TECHNIQUE

Response pattern     Frequency
+ + + + a            3
...

a Scale types.

It should be noted that not all cards selected were able to
fulfill Condition a mentioned above. Those cards not meeting this
criterion, along with those not meeting other criteria for
selection, are listed in Table 5 along with the various cutting
points, response category weights, the frequency (out of 100
subjects) with which the card in a particular split was judged
hostile, and the contrived card into which the card was placed.
It is evident from examining this table that 
it was not possible to obtain a good split for 
a card which is judged hostile by 50% of the 
subjects. This failure is similar to that usu- 
ally experienced with the Thurstone meth- 
ods in attempting to get good differentiating 
items, or items with low Q values which at 
the same time represent the middle of the 
psychological continuum. 

With four contrived cards there are 16 pos- 
sible scores or types of response patterns, of 
which 5 may be designated as scale types and 
11 nonscale types. The frequency with which 
each type was found is listed in Table 6. 
Examination of this table indicates that only 
7 persons out of 100 are nonscale types. 
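The construction of contrived cards and the coefficient of reproducibility can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: errors are counted as the smallest number of responses that must be changed to reach a perfect scale type, which is only one of several counting conventions, and all names and data are hypothetical.

```python
def dichotomize(weight, cut):
    """Return 1 (hostile) if a 0/1/2 response weight lies above the cut.

    cut=0 corresponds to (0) vs. (1, 2); cut=1 to (0, 1) vs. (2).
    """
    return 1 if weight > cut else 0


def contrived_response(weights, cuts):
    """A contrived card is hostile when two or more of its three
    member cards are judged hostile at their cutting points."""
    hostile = sum(dichotomize(w, c) for w, c in zip(weights, cuts))
    return 1 if hostile >= 2 else 0


def scale_types(n_items):
    """Perfect patterns, with items ordered from the most to the least
    frequently endorsed: a run of 1s followed by a run of 0s."""
    return [[1] * k + [0] * (n_items - k) for k in range(n_items + 1)]


def errors(pattern):
    """Fewest responses to change to reach some perfect scale type
    (an assumed error-counting convention)."""
    return min(
        sum(a != b for a, b in zip(pattern, perfect))
        for perfect in scale_types(len(pattern))
    )


def reproducibility(patterns):
    """Coefficient of reproducibility: 1 - errors / total responses."""
    total = sum(len(p) for p in patterns)
    return 1 - sum(errors(p) for p in patterns) / total


# Hypothetical response patterns to four contrived cards.
example_rep = reproducibility([[1, 1, 1, 1], [1, 0, 1, 0]])
```

With four contrived cards this scheme yields five perfect patterns out of sixteen possible ones, in line with the counts given in the text. As a purely arithmetic illustration, 14 errors among 400 responses would give 1 - 14/400 = .965.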

Last, the values obtained for the 31 cards via the EAI, SC, and
Likert methods were intercorrelated. The resulting correlations
were: EAI vs. SC, .99; EAI vs. Likert, .94; and Likert vs. SC,
.92.


Discussion 


... questions, and like ... many others. Apparently, ... ments.

There are, however, further considerations. It is apparent from a
perusal of Table 1 that the differential ability of the cards
with regard to separating high hostility perceivers from low
hostility perceivers is not greatly dependent upon the scaled
value of the card. There are several instances where two cards
are perceived as nearly equivalent on the dimension of hostility
and yet one card is able to differentiate the aforementioned high
and low hostility perceivers while the other is not. For example,
Card 10 is given an EAI scaled value of 1.27, which is nearly
identical to the scaled value of 1.19 received by Card 16. The t
value of Card 10, however, was 4.67 while that of Card 16 was
.55. There are several possible explanations. One is that some
cards contain many possible dimensions in their stimulus
characteristics. High hostility perceivers (upper quartile in the
Likert method), however, are more sensitive to the hostile
possibilities of the card than to other motives such as
achievement, sex, and affiliation, to name just a few. Low
hostility perceivers (lower quartile in the Likert method),
however, are probably either disposed to avoid seeing hostility
or perhaps simply able to perceive the other dimensions as more
strongly characterizing the picture. Card 10, which is described
by Murray (1943) as "a young woman's head against a man's
shoulder" (p. 19), would seem to be a multidimensional picture.
There are many plausible explanations for the embrace, some
positive, others negative. Card 16, however, is a picture devoid
of any motives from a stimulus point of view. Hence, high
hostility perceivers have no alternative except to perceive the
card as nonhostile. To do otherwise is to deviate sharply from
the stimulus possibilities of the card. The low hostility
perceivers likewise would be naturally expected to perceive
little hostility.

It is thus conceivable that the number of alternative themes that
can be perceived in a picture will determine its differential
ability. The greater the number of possible themes in a card, the
greater the differentiation between the judgments of persons high
and low on one of the dimensions of the card.

It also follows that the greater the number of themes, the less
likely a single motive is to receive a uniformly high judgment
from all subjects. The reason for this assumption is that not all
subjects will perceive the dominant motive since they may be more
sensitized to other motives. The result is that the overall
saliency of a motive is not only a function of the stimulus
impact of the motive, but a function of the number of competing
motives as well.

How then can one be sure as to which of these methods of judging
is employed by the subject? One answer is to have individuals
judge a picture for all possible motives. The perceptual
ambiguity level could then be determined (Kenny, 1961) by the
equation A = 1 − Σp(i)², where A equals the perceptual ambiguity
of the picture and p(i) equals the proportion of any i motive
appearing in the picture.

A will be at a maximum when the proportions of all motives are
equal (i.e., Motive A = .33, Motive B = .33, Motive C = .33;
A = .67). A will decrease as the split between the proportions of
two or more motives widens (i.e., Motive A = .50, Motive B = .45,
Motive C = .05; A = .55). Current work on the differentiating
value of the cards as a function of A is underway and will be
reported in future articles.
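The ambiguity index is simple enough to check by direct computation. The sketch below implements A = 1 − Σp(i)² as stated in the text; the function name is hypothetical, and the inputs are the two three-motive examples used above.

```python
def perceptual_ambiguity(proportions):
    """Return A = 1 - sum of squared motive proportions."""
    if any(p < 0 for p in proportions):
        raise ValueError("proportions must be non-negative")
    return 1.0 - sum(p * p for p in proportions)


# Three equally likely motives: the maximum for three motives.
equal_split = perceptual_ambiguity([1 / 3, 1 / 3, 1 / 3])

# One motive pulling away from the others lowers the index.
uneven_split = perceptual_ambiguity([0.50, 0.45, 0.05])

# A picture with a single motive has no ambiguity at all.
single_motive = perceptual_ambiguity([1.0])
```

The equal split gives roughly .667 and the uneven split .545, so the index falls as one motive comes to dominate the picture.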
Another problem with scaling procedures is that they usually have
failed to incorporate the presence of inhibitory factors into the
scaled values. If two people perceive an equal amount of
hostility yet differ in the inhibitions expressed with regard to
this hostility, their personalities might differ radically. Yet,
our scaled values along with all other studies involving scaling
of thematic stories would fail to take cognizance of this fact.
It would seem, therefore, that the prediction of overt behavior
from a knowledge of the scaled value of pictures might be
improved if the scaled values reflected the multidimensionality
of the stimulus properties of the picture. Perhaps some of the
newer multidimensional scaling devices (Torgerson, 1958) may
prove to be of greater value than the older methods.

The large variability of the Q values might tend to make one
believe that Q might be regarded as an index of projection for a
picture. Pictures with low Q values might be poor pictures for
projective purposes, while a high amount of variability for the
objective dimensionality of a picture (high Q) might be
considered a good index of projection. Little support, however,
is given for the belief that high and low perceivers of hostility
are differentiated by high Q values if one examines Table 1. Two
factors may serve to explain this result. First, a low Q value
for the judgments of one dimension may become a high Q value when
other dimensions are considered for the same picture. Secondly, a
card may not be relevant for a given dimension. Accordingly, high
Q values may result not because individuals differ as to the
placement of the card on a dimension, but because they differ as
to whether or not the picture belongs in the dimension continuum
at all. Under these circumstances, a high Q may merely reflect a
high amount of error variance in the judgment due to the random
assignment of values to a card when it does not seem applicable
to a dimension. Thus, for Q to be regarded as an accurate measure
of projection we should ascertain that the card is relevant to
the dimension being judged, and that it taps this dimension only.
Since it is doubtful that the current TAT cards meet both
criteria completely, the use of Q as a projective index would
seem to have some drawbacks.

There are obviously a good many difficul- 
ties in the scaling of thematic cards and the 
question may arise as to whether it is worth 
all of the trouble. Is a knowledge of the 
stimulus value important? Cannot a skilled 
clinician “get along” without knowing the 
stimulus value of the cards? It is our belief 
that behavior may be viewed as the pooled 
interaction of stimulus, background, and or- 
ganismic variables, a view very close to that 
expressed by Helson (1955). From this frame 
of reference, knowledge of the stimulus prop- 
erties seems essential to the accurate predic- 
tion of behavior. The “good” clinician prob- 
ably carries in his head a normative index of 
responses to each of the TAT pictures which 
serves as a rough estimate of the stimulus 
value. But, the clinician limited by his own 
experience probably could not achieve the ac- 
curacy of estimation that the actual meas- 
urement of a well standardized group would 
achieve. 

Yet another important factor is the determination of the
relationship between the stimulus properties of a card, the
response elicited, and the possible behavioral correlate. This
important area has been untouched by psychologists because of a
lack of knowledge of the stimulus properties of the cards
(Murstein, 1959). It should be possible shortly to determine the
relationship, if any, between perceptual deviancy and behavioral
deviancy. It is not assumed that maladjustment is a simple linear
function of the discrepancy between the stimulus properties of a
picture and the story told to it. There may be a curvilinear,
hyperbolic, or parabolic relationship, or perhaps no
relationship. The determination of an answer to this important
question only awaits our quantification of the stimulus
properties of our projective instruments.

Finally, it should be emphasized that the values obtained may not
hold for other kinds of students at other locales. In fact,
serial position effects with each scaling method as well as
between the methods have not been adequately controlled. To do so
would have involved a great number of subjects, which, while
desirable, was not practical, nor directly pertinent to the
rather broad purpose of this study. Such refinements should,
however, be utilized where the scale values themselves are of
concern rather than the question of whether scaling itself can be
achieved.


SUMMARY 


The purpose of this study was to determine whether the entire set
of 31 TAT cards could be scaled for the dimension of hostility
through the use of several widely used scaling methods.

A group of 100 undergraduate psychology students were
administered the TAT via slide projections a... 13MF. The
coefficient of reproducibility for these cards using the
H-technique method of "contrived cards" was .96. It was concluded
that all of the aforementioned methods could be used in scaling
the dimension of hostility. The implications of the results with
regard to future work in the area of personality were discussed.

Murstein, David, Fisher, and Furth
In recent years, an increasing amount of
attention has been devoted to the problem of
duration of stay in psychotherapy. Reports
from many diverse types of clinical settings
have indicated that early discontinuation in
outpatient psychotherapy is a reliable finding
of some importance (Affleck & Mednick,
1959; Garfield & Kurz, 1952; Rogers, 1960).
Several studies have attempted to appraise
selected patient variables related to and
predictive of continuation in psychotherapy
(Rosenthal & Frank, 1958; Rubinstein &
Lorr, 1956; Sullivan, Miller, & Smelser,
1958; Taulbee, 1958). With the possible exception
of educational level, the findings on
most of these variables have been inconsistent.
Research in our setting on patient attributes
related to termination indicated that
diagnosis, sex, age, and education were not
significantly related to duration of stay (Garfield
& Affleck, 1959). Education below a certain
minimal point may be important, but we
found no evidence to indicate its usefulness
as a predictor with persons who have gone
beyond the eighth grade in school.

The general failure to relate broad patient
variables to attrition led to an interest in the
therapist as a variable affecting attrition rates.
While it is apparent that the interaction between
the individual patient and the individual
therapist is exceedingly important in
the problem of attrition, we were interested
first in getting a better understanding of the
orientation that therapists have toward candidates
for therapy in general. Are there common
points of view toward patients? What
patients are initially viewed in a highly favorable
way? For what reasons is this the
case? Which patients are seen negatively?


1 Presented in part at the Annual Meeting of the
American Psychological Association, Chicago,
September 1960.


THERAPISTS’ JUDGMENTS CONCERNING PATIENTS
CONSIDERED FOR PSYCHOTHERAPY 1


SOL L. GARFIELD AND D. C. AFFLECK


University of Nebraska College of Medicine 


Are anxiety and defensiveness related to the 
judgments and attitudes therapists have to- 
ward patients? These were some of the ques- 
tions that led to an initial exploratory study 
of therapists’ attitudes toward therapy candi- 
dates. This in turn was part of a larger study 
of variables related to continuation and prog- 
ress in psychotherapy. 


PRESENT STUDY 


In this investigation, therapists were asked 
to complete a brief questionnaire and check- 
list at staff meetings at which cases were 
discussed and considered for outpatient psy- 
chotherapy. The questionnaire included open- 
ended questions on assets, deficiencies, goals 
in therapy, and likely problems in therapy. 
Each therapist also was asked to rate each 
patient on a four-point scale in terms of thera- 
peutic prognosis—excellent, good, fair, or 
poor. Similar ratings were requested concern- 
ing the degree of anxiety in the patient, the 
latter’s defensiveness or rigidity, the rater’s 
personal feelings toward the patient, and the 
rater’s interest in taking the patient on for 
psychotherapy. 

The ratings were secured in a regular outpatient
staff meeting which met weekly for 2
hours. Two to three cases were discussed at 
each meeting. These cases had been seen 
previously by a psychiatric resident and social
worker, and in about one-half of the cases
by a psychologist. The intake reports were all 
read in their entirety. After the intake ma- 
terial was presented to the staff, but prior to 
any discussion of the case, each of the indi- 
viduals at the staff meeting was asked to fill 
out the questionnaire. 

Responses were secured from 20 different 
therapists from three disciplines: psychiatry, 
clinical psychology, and psychiatric social 
work. The number of patients rated and 
TABLE 1
CORRELATIONS BETWEEN RATED PATIENT VARIABLES
Correlation coefficient
; y Ë -E 
ey AB AC AD AE BC B-D CD D 
apis! : j Ab — 
23 03 S19 72 att 13(19)  .09(22) —.33(18) yee 
> 23 18 —.62(24)** 49024)" (21024) — 0) 33° °~C ae 
Ae ae —.70"* 55 Sie 07 53 30" T 
hs & 54" 66% e D 13 = 26 ao 
5 18 66016)" —50(13) .69(17)** “gue Mas sw soaa, oors 
e a7 26 —91(16)** 82% 77 igi) 46 -eoe 780 
7 27 :38* —.26 .76** .65** 16 20 — 28 pa 
8 12 42 —.10 45 32 —.13 Al —.76* ae 
9 20 ‘S1* —.33 .74** .73** 00 52 sds el 
10 31 .68** 13 .60** 70** .25 64** -12 cg a 
11 17 4016)  —.04(15) “54+ 55% = .15(14) 25616) —.54(15) 
12 29 05 —.53** .73** 26 20 22 ig N 
3 4 8B —56* 54 18 10 —.14 —.76** 2 
Median Correlation 63 
38 —53 66 65 —.07 35 E „a 


Note. Numbers in parentheses indicate number of cases when they vary from
that indicated in Column 2.
A-Therapeutic prognosis.
B-Degree of anxiety.
C-Defensiveness and rigidity.
D-Personal feelings.
E-Interest in taking into treatment.
* Significant at .05 level.
** Significant at .01 level.


evaluated by each the
32 with a median nu
of the patients wer
plied for outpatie:
been recommende
the initial screeni
» With a median
In terms of diagnosis, the
group was as follows: Psychoneurosis, 16;
Personality Disorders, 14; Psychosis, 3; and
other diagnoses, 5.


RESULTS


The responses secured from the therapists
were tabulated for each category of response.
In order to evaluate the reliability of the
ratings, average intercorrelations were computed
on the five raters who had seen at
least 16 patients in common. Two of these
raters were staff psychologists, one was a staff
psychiatrist, and two were psychiatric residents.
Ebel’s (1951) technique for estimating
the reliability of ratings was used. The judges
showed a high degree of agreement in their
ratings of therapeutic prognosis (r = .88),


personal feelings toward the patient (r = . . .),
interest in taking the patient on for therapy
(r = .80), and patient’s anxiety (r
= .88). Moderate agreement was found in
the judges’ estimates of the patient's defensiveness
(r = .68).

The ratings obtained were then intercorrelated
where appropriate. Thirteen therapists
who rated at least 12 patients were used in
this analysis, which forms the first part of
our report. We shall discuss now the ratings
of therapeutic prognosis and their relationship
to other judgments.
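
The reliability estimates above rest on an analysis-of-variance treatment of the patient-by-rater table. As a rough illustration only, the following sketch computes the reliability of a single rating and of the averaged ratings from such a table. The ratings are hypothetical, and the choice to remove between-rater variance from the error term is an assumption; the report does not say which form of Ebel's formula was applied.

```python
# Sketch of an analysis-of-variance reliability estimate for ratings.
# All ratings are hypothetical; removing rater (column) variance from the
# error term is an assumption, not something stated in the report.

def rating_reliability(ratings):
    """ratings[i][j] is the rating given to patient i by rater j.

    Returns (reliability of a single rating, reliability of the average
    of all raters).
    """
    n = len(ratings)       # number of patients
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_patients = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_patients - ss_raters

    ms_patients = ss_patients / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    single = (ms_patients - ms_error) / (ms_patients + (k - 1) * ms_error)
    average = (ms_patients - ms_error) / ms_patients
    return single, average


# Hypothetical 4-point prognosis ratings: rows are patients, columns raters.
hypothetical_ratings = [
    [4, 3, 4],
    [2, 2, 3],
    [1, 1, 2],
    [3, 3, 3],
    [4, 4, 3],
    [2, 1, 2],
]

single_r, average_r = rating_reliability(hypothetical_ratings)
print(f"single rating: {single_r:.2f}; average of raters: {average_r:.2f}")
```

The two values are linked by the Spearman-Brown relation, so the reliability of the averaged ratings always exceeds that of a single rating when the latter lies between 0 and 1.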


In line with other findings, it was hypothesized
that degree of anxiety would be correlated
positively with prognosis whereas defensiveness
and rigidity would be negatively
correlated with prognosis (Rubinstein & Lorr,
1956; Taulbee, 1958). These predictions were
generally supported, although not to a marked
degree. As can be seen in Table 1, the
correlations between therapeutic prognosis
and degree of anxiety are positive, but less
than a third of them are statistically significant,
with the median being .38. Ratings of
anxiety thus bear only a limited relationship
to ratings of prognosis. The relationship between
prognosis and defensiveness appeared
to be somewhat more marked with approxi- 
mately half of the correlations approaching 
significance. As might be anticipated, this re- 
lationship was negative in all but one case. 
Apparently, our judges react somewhat more 
strongly to defensiveness and rigidity in re- 
lation to prognosis than they do to anxiety in 
this regard. 

Ratings of prognosis, on the other hand, 
were highly correlated with positive feelings 
of the judges toward the patient. Only one of 
the correlations was not significant at least at 
the .05 level of confidence, with the median 
correlation being .66. The personal feeling of 
the raters thus appears most closely related 
to ratings of prognosis, or vice versa. Ratings 
of “interest in taking the patient into treat- 
ment” were also highly correlated with both 
ratings of therapeutic prognosis and the per- 
sonal feelings of the raters toward the pa- 
tients. The latter finding suggests that per- 
sonal feelings toward the patient, interest in 
taking the patient on for therapy, and judg- 
ments of prognosis may be manifestations of 
the same positive view of the patient. One 
cannot state whether the raters “like” patients 
with good prognosis, or whether a good prog- 
nostic rating is given to patients that the 
therapist reacts to personally in a positive 
fashion. 

The other findings were not as marked, al- 
though there was some negative relationship 
between the personal feelings of the rater and 
defensiveness of the patient. It would thus appear
that judgments of prognosis are most
closely related to the personal feelings of
therapist judges (or vice versa), and that the
latter bear more relationship to judgments of
defensiveness and rigidity than they do to
judgments about the patient's anxiety. This
pattern generally is congruent with that
recently reported by Strupp and Williams 
(1960). In studying two therapists, they 
found that “nondefensive, insightful, likable 
and well-motivated patients were seen as most 
likely to improve in psychotherapy” (p. 440). 


ASSETS FOR PSYCHOTHERAPY 


As mentioned previously, each rater also 
was asked to list the therapeutic assets for


TABLE 2

PATIENT ASSETS LISTED FOR PSYCHOTHERAPY

Asset                            Frequency
Intelligence                        138
Anxiety-discomfort                  112
Motivation                           98
Age                                  49
Insight-awareness of problem         49
Past adjustment                      29
Ability to relate                    20

each patient as well as to indicate likely prob- 
lems to be encountered in psychotherapy. A 
total of 532 responses pertaining to patient 
assets were obtained with a variable number 
being listed for any given case. The average 
number per patient was one and a half. After 
a preliminary analysis was made of all the 
individual responses, the results were grouped 
into appropriate categories. Although a very 
large number of responses were listed, these 
could be classified with little difficulty into a 
relatively small number of categories. All of 
the items which were mentioned at least 10 
times are presented in Table 2. 

As noted in Table 2, three categories make 
up over half of the listed assets, i.e., intelli- 
gence, anxiety, and motivation. When age and 
insight are added, these five account for over 
80% of all the assets listed for these patients. 
On the basis of such ratings, one might infer 
that the average therapist prefers a patient 
who is intelligent, anxious, well motivated for 
therapy, young, and with some insight into his 
difficulties! This seems to be borne out by an 
analysis of the assets listed for patients in re- 
lation to the ratings by our judges of personal 
feelings toward the patients. When the total 
group of patients is dichotomized in terms of 
the median ratings on this variable, it is noted 
that the group which receives the higher rat- 
ings also receives almost twice the frequency 
of listed assets. This difference is significant 
at the .01 level of confidence (χ² = 15.42, df
= 1). With the exception of age, the assets
mentioned are linked more frequently with 
patients given high personal preference rat- 
ings by the therapists. The preferred therapy 
patient, as inferred from these listings by our 
sample of therapists, bears a close resem- 
TABLE 3


Mean Mean 
of of s 
ha) I 
Scale T R 
3 28 
Defensiveness he 1B 28 
seed 3.00 2.64 1.88* 
re sis 88 
A A feelings 2.64 pa pa 
Interest in taking 2.69 2.52 .87 


* Significant at .05 level of confidence, one-tailed test. 


blance to the type of preferred patient men- 
tioned in the research by Hollingshead and 
Redlich (1958). 
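
The proportions cited for Table 2 can be checked directly from the tabled frequencies. The short sketch below uses only the figures given in Table 2 and the stated total of 532 responses; the variable and function names are illustrative.

```python
# Shares of listed assets, using the frequencies in Table 2 and the
# reported total of 532 asset responses.
asset_frequencies = {
    "Intelligence": 138,
    "Anxiety-discomfort": 112,
    "Motivation": 98,
    "Age": 49,
    "Insight-awareness of problem": 49,
    "Past adjustment": 29,
    "Ability to relate": 20,
}
TOTAL_RESPONSES = 532  # includes assets mentioned fewer than 10 times


def share_of_total(names):
    """Proportion of all listed assets accounted for by the named categories."""
    return sum(asset_frequencies[name] for name in names) / TOTAL_RESPONSES


top_three = share_of_total(["Intelligence", "Anxiety-discomfort", "Motivation"])
top_five = share_of_total([
    "Intelligence", "Anxiety-discomfort", "Motivation",
    "Age", "Insight-awareness of problem",
])

print(f"three categories: {top_three:.3f}")  # 348 / 532
print(f"five categories:  {top_five:.3f}")   # 446 / 532
```

The three leading categories account for 348 of the 532 responses and the five leading categories for 446, in line with the "over half" and "over 80%" figures in the text.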

It is of interest also to comment on the 
variability with which the various assets were 
listed by different therapists. Intelligence, for 
example, was listed in one out of eight cases 
by one person, but in almost two out of three 


e therapist, but in only 
1 case out of 16 by another. Thus, while there


concerning desirable features
in a psychotherapy patient, there is
some variation among therapists concerning 
the frequency of emphasis on certain aspects 


ur data are too meager at 


THERAPISTS’ JUDGMENTS AND DURATION OF
STAY IN PSYCHOTHERAPY


Of the 38 patients who were evaluated initially
at the outpatient staff conferences, 24
were assigned to a therapist in our outpatient
clinic, thus allowing for some follow-up study.
All of the therapists who saw these patients
participated in the initial rating procedures.
The other 14 patients were referred to other
agencies, clinics, or hospitals. In a few of
these cases, no treatment was recommended.
The median number of interviews kept for
the group of patients assigned to therapy here
was 17. (This atypically high figure may be
somewhat misleading. There were 11 patients
who kept 12 or less interviews, a value which
is the same as that previously reported as the
median on a much larger sample of patients
[Garfield & Affleck, 1959].)

It was hypothesized that ratings of low
defensiveness, high anxiety, good prognosis,
positive personal feelings, and a positive interest
in taking would be related to greater
duration of stay in psychotherapy. Differences
between the mean ratings of all scales
for patients above and below the median were
analyzed. Each patient was rated by a median
of 9 raters, with a range of 6 to 14 raters.
Table 3 presents the results of these analyses.
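
The comparison just described can be sketched as a median split on number of interviews followed by a two-group t statistic on the mean ratings. The records below are hypothetical, and both the pooled-variance form of t and the assignment of patients exactly at the median to the shorter-stay group are assumptions; the report states only that a one-tailed test was used.

```python
# Median split on duration of stay, then a pooled-variance t statistic.
# Records are hypothetical; the test form is an assumption.
from math import sqrt
from statistics import mean, median, variance


def median_split(records):
    """records: list of (interviews_kept, mean_rating) pairs.

    Returns (ratings of patients above the median stay,
             ratings of patients at or below the median stay).
    """
    cut = median(interviews for interviews, _ in records)
    longer = [rating for interviews, rating in records if interviews > cut]
    shorter = [rating for interviews, rating in records if interviews <= cut]
    return longer, shorter


def pooled_t(group_a, group_b):
    """Two-sample t statistic with a pooled variance estimate."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))


# Hypothetical (interviews kept, mean prognosis rating) for eight patients.
hypothetical_records = [
    (3, 2.4), (6, 2.8), (9, 2.5), (12, 2.9),
    (20, 3.1), (25, 2.7), (31, 3.3), (40, 3.0),
]

longer_stay, shorter_stay = median_split(hypothetical_records)
t_value = pooled_t(longer_stay, shorter_stay)
print(f"t = {t_value:.2f} with {len(longer_stay) + len(shorter_stay) - 2} df")
```

A positive t here means the longer-stay group received the higher mean rating, which is the direction hypothesized for prognosis.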

The only rating which was significantly related
to duration of stay was prognosis; patients
remaining in therapy longer were rated
initially as having a better prognosis. Despite
the moderate intercorrelation of prognosis
with the personal feelings of the therapist,
ratings of the latter were not significantly
related to duration of stay. The failure of ratings
of personal feelings, interest in taking
the patient on for therapy, anxiety, and defensiveness
to predict duration of stay is of
interest in the light of our previous findings
on therapists’ preferences. Tentatively, it appears
that the set therapists develop toward
patients on these dimensions are reliable, but
have no predictive validity as regards duration
of stay. While interest in taking the patient
on for therapy and the personal feelings
of the therapist toward the patient were significantly
correlated with ratings of prognosis
and thus suggestive of a common view of the
patient, only the latter appeared related to
duration of stay in psychotherapy.


SUMMARY AND CONCLUSION


Our findings show a high degree of agreement
among therapists in terms of judgments
of prognosis, personal feelings toward patients,
and interest in taking patients on for psychotherapy.
Ratings of these variables also show
moderate intercorrelations. This suggests that
certain patients have a high valence for therapists
and that there is agreement among therapists
as to who these patients are. A somewhat
similar consensus was evident in the
listing of patient assets for psychotherapy
where five assets constituted 80% of those
listed. In terms of the assets listed and the


negative feeling expressed by therapists to- 
ward defensiveness in patients, it would ap- 
pear that a positive reaction is expressed to- 
ward the patient least difficult to work with 
and, possibly, the person least in need of 
skilled help. It was further demonstrated that 
patients who evoke positive feelings from 
therapists are characterized by those therapists
as having significantly more assets, particularly
intelligence, motivation, anxiety, and
insight. 

When the ratings were related to actual 
duration of stay, it was found that patients 
remaining in therapy longer were rated as 
having a better prognosis. None of the other 
ratings were significantly related to duration 
of stay. While therapists show high agree- 
ment in their preferences and personal feel- 
ings for patients, these ratings were not re- 
lated to actual duration of stay. 
AN EMPIRICAL SCALE OF THERAPIST VERBAL ACTIVITY
LEVEL IN THE INITIAL INTERVIEW 1


EDMUND S. HOWE AND BENJAMIN POPE


University of Maryland School of Medicine 


Subjecting the psychotherapist to examina- 
tion as an independent variable reflects ac- 
ceptance of the proposition that, regardless of 
his theoretical orientation, what the therapist 
says is of central importance in the therapeutic
transaction. Subjecting him to similar
examination in the initial interview implies 
that the therapist’s mode of verbalization 
may have an important bearing upon achievement
of his diagnostic or other goals.

(e.g., Deutsch & Murphy, 1955; Finesinger,
1948; Gill, Newman, & Redlich, 1954) at- 
tempt to arrive at some kind of working diag- 
nostic formulation during an initial interview 
not by eliciting a mass of factual information 
about various sectors and stages of the pa- 
tient’s life history: but instead by following 
the patient’s own leads, his sequential ac- 
count of himself, his life, and his muse 

These transitions in the form of the initial
interview can be described in terms of increasing
adoption of the projective interview, in
which it is now commonly accepted that one
is apt to discover more information of a relevant
nature either by remaining silent, or at
most by asking rather vague, nonleading questions
onto which the patient may project his
own referents, and his own interpretation of
what is “meant.” In this way one learns much
not only about circumstantial (factual) material,
but also about those contiguous emotional
and associational processes which usually
lie nearer to the heart of the matter.

The foregoing developments have given rise
to the concept of Therapist Activity Level
(e.g., Finesinger, 1948) with the attendant
implication that lower Activity Levels are
potentially more advantageous than higher
ones, for the purposes of gathering relevant
information, fostering the development of
transference reactions, and avoiding a drift
into a social or personal relationship with the
patient. (There are, however, obvious exceptions
to any general rule, such as the use of
higher levels of activity for such supportive
purposes as encouraging the inhibited patient
to talk during an initial interview [Gill et al.,
1954] or to prevent acting-out behavior.)

These commonly assumed benefits of maintaining
a low level of verbal activity remain
hypothetical, however, since they have never
been subjected to experimental scrutiny. This
paper constitutes a preliminary basic step in
a research program the aim of which is to
evaluate the role played by, and the impact
upon the patient of, the therapist’s Activity
Level in the initial interview. The experiments
to be reported at this time were performed
(a) to examine the rateability of the concept
of Activity Level in terms of three assumed
attributes (to be discussed below); (b) to
develop an Activity Scale for subsequent
measurement procedures; and (c) to explore
some of the empirically controllable variables
that might affect the reliability of application
of such a scale to actual interview material.

The choice of a definition of Activity, how- 
ever, presents a problem, for its attributes are 
not clear, and have never been spelled out. 
Deutsch and Murphy (1955) for example, 
made no attempt to define Activity, other than 
implicitly by rejecting the question-and-an- 
swer interview pattern, and instead proposing 
a “process of facilitation through the selective 
repetition in interrogative form of the pa- 
tient’s remarks” (p. 18). Finesinger (1948) 
likewise skirted the conceptual problem of 
definition in expressing his preference for 
Activity which is kept “as low as is consistent 
with the attainment of therapeutic plans and 
goals” (p. 192). 

Several research workers (e.g., Bordin, 
1955; Dibner, 1953; Osburn, 1951) have 
accepted the term Ambiguity as a significant 
aspect of therapist behavior. Dibner in fact 
showed that certain consequences in the pa- 
tient’s behavior (e.g., increased “anxiety”) 
follow greater therapist Ambiguity. Bernstein, 
Lennard, and Palmore (1958) likewise observed
greater “ease of communication” by
the patient following greater therapist Specificity
(i.e., less Ambiguity). Several years
earlier Snyder (e.g., 1945) investigated Lead,
which he assumed to be a primary dimension
of therapist verbal behavior. (Indeed, it is
interesting to note that when Freud [1948]
himself abandoned hypnosis in favor of the
psychoanalytic technique, he contrasted the
suggestive nature of the former with the nonleading
character of the latter.) Finally, it is
considered that a therapist response may also
be looked at from the standpoint of the degree
of Inference which it carries, or which it
conveys to the patient. In the studies to be
described these three attributes, Ambiguity,
Lead, and Inference, will be used to charac- 
terize what is meant by variations in Activity 
Level. It was assumed for the purpose of 
these studies that the three attributes are 
moderately (if not highly) intercorrelated, so 
that the three terms are to some extent in- 
terchangeable. Thus, Ambiguity subjectively 
feels as though it would be negatively corre- 
lated with Lead and with Inference, whereas 
the last two would be positively correlated 
with each other. To this extent Activity is 
assumed, for present purposes, to be one- 
dimensional. 


METHOD 


Study 1. A broad variety of over 20 published
psychotherapy interviews involving different types of 
patients, different phases of treatment, and thera- 
pists of different theoretical allegiance, were used as 
source material to compile a representative sample 
of 50 abstract descriptions of therapist verbal re- 
sponses. Thirty Board-certified psychiatrists rated 
each of these descriptive responses (presented on 
individual 3 × 5-inch cards) for Activity Level along
an 11-point scale. A broad working definition char- 
acterized Activity Level as follows: 


A high-active response from the therapist is not, 
of course, necessarily one which has greater length. 
It does, however, have relatively low Ambiguity 
about it; it involves a marked degree of Lead by 
the therapist; and it carries a high degree of In- 
ference. Conversely, a low-active response is highly 
ambiguous; it manifests a low degree of Lead by 
the therapist; and it carries a low degree of In- 
ference. Thus, compare the following three de- 
scriptive responses: 


1. Therapist gives a general, unfocussed invitation
for the patient to talk.

2. Therapist asks the patient to describe the last
occasion when a pattern of symptoms occurred.

3. Therapist explores the patient's feelings about
something just reported by the patient.


Going from 1 to 2 through 3, the responses become
less ambiguous, they show progressively more
lead, and they connote an increasing degree of
inference. . . .


Each rater also sorted a duplicate set of cards 
into one of three groups: (a) responses primarily or 
mainly diagnostic in purpose (ignoring secondary, 
therapeutic value); (b) responses primarily or 
mainly therapeutic in purpose (ignoring secondary, 
diagnostic value); and (c) responses fitting neither 
category. The two tasks were given in one of two 
sequences to alternate subjects.

Results of Study 1. A Lindquist (1953, pp. 267-273)
Type I analysis of the 11-point rating data
established (a) an overall difference among the 50
item means (p < .001); (b) a nonsignificant Sequence
main effect; and (c) a nonsignificant Sequence
× Items interaction effect (p > .05). Since
the Activity ratings were thus clearly not altered as
a result of prior judgments of diagnostic vs. therapeutic
value, the rating data were then pooled. The
interclass r was .50; the reliability of average ratings,
.93 (Guilford, 1954).

[Figure 1: Activity Level plotted against category of
therapist response (FAC = Simple Facilitation responses;
EXPL = Exploratory responses; CLAR = Clarification
responses; INT = Interpretative responses; APP/REAS =
Approval and Reassurance; PERS = Attempt to Persuade).
Plotted are the means of individual descriptive responses
from the original rating study and the medians of these
means.]

Fig. 1. Mean and median Activity Levels of 35
therapist responses placed (with significant statistical
agreement among 10 subjects) into five conventional
categories of therapist behavior.

After computation of appropriate chance fre- 
quencies via Fisher’s Exact test (Siegel, 1956) it was 
established that the 30 subjects significantly agreed 
upon only 10 of the responses as being primarily of 
“treatment value,” and upon only 7 as being pri- 
marily of “diagnostic value.” A comparison of the 
mean Activity Levels of these two groups of re- 
sponses via the Mann-Whitney U test (1947) showed 
the responses in the treatment category to be more 
active (p < .001). This accords with commonsense 
expectation, and constitutes a modicum of face va- 
lidity for the working concept of Activity. 
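
The comparison of the two groups of responses can be illustrated with a direct computation of the Mann-Whitney U statistic. The Activity Levels below are hypothetical stand-ins for the 10 "treatment" and 7 "diagnostic" responses; the actual item means are not reproduced here, and the significance level would still have to be read from tables of U.

```python
# Mann-Whitney U computed by counting pairs. Activity Levels are hypothetical.

def mann_whitney_u(first, second):
    """U for `first`: number of (a, b) pairs with a > b; ties count one half."""
    u = 0.0
    for a in first:
        for b in second:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical mean Activity Levels of the agreed-upon responses.
treatment_items = [8.9, 9.3, 7.8, 8.4, 9.0, 7.1, 8.1, 9.6, 8.7, 6.9]
diagnostic_items = [5.8, 6.2, 6.3, 6.7, 4.3, 5.1, 7.4]

u_treatment = mann_whitney_u(treatment_items, diagnostic_items)
u_diagnostic = mann_whitney_u(diagnostic_items, treatment_items)

# The two U values always sum to the number of pairs, n1 * n2.
print(u_treatment, u_diagnostic, len(treatment_items) * len(diagnostic_items))
```

The smaller of the two U values is the one conventionally referred to tables; a value near zero indicates almost complete separation of the two groups.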

Study 2. This was undertaken to examine the re- 
lationship between Activity Level as rated in Study 1 
and five conventional labels frequently applied, in 
the contemporary literature on psychotherapy re- 
search, to various categories of therapist operations. 
Ten new subjects, four psychiatrists and six clinical 
psychologists with significant therapeutic experience,
sorted the set of 50 therapist responses into the following
six categories: Simple Facilitation; Exploration;
Clarification; Interpretation; Approval and
Reassurance; and Unclassifiable. Subjects were given
a broad working definition of each category. For
example, Simple Facilitation responses were defined
as “quite unfocussed responses designed to get the
patient talking, and to keep him talking, without
imparting any direction to him whatsoever.”


TABLE 1

MEAN ACTIVITY LEVEL AND σ OF 35 THERAPIST RESPONSES
BY PSYCHIATRISTS AND BY MEDICAL STUDENTS WITH HIGH
AND LOW INTEREST IN PSYCHIATRY

Type of Subject                  N     Mean    σ
Psychiatrists                    15    6.17    3.03
Students with high interest      19    oH      3.02
Students with low interest       18    7.08    Sulit

Results of Study 2. Application of Fisher's Exact
test (Siegel, 1956, Table D) to each category showed
better than chance agreement among subjects on 39
of the 50 responses, only 4 of which were in the
Unclassifiable category. A Kruskal-Wallis (1952)
one-way analysis of variance of the Activity Levels
of the 35 responses within the five conventional categories
yielded an H of 22.79 (with df = 4, p < .001).
Figure 1 shows the data plotted, Activity Level
medians against type of conventional category. The
relative order of magnitude of median Activity Level
for the categories largely accords with subjective
commonsense expectation. The “Persuasive” response,
which as one might expect is rated most Active, was
not actually included in Study 2, since it was the
only one of its kind in the original set of 50 responses.
It is presented in Figure 1, however, for
the sake of perspective and completeness over the
entire range of therapist operations actually studied.
It is noteworthy that there is considerable overlap
between adjacent types of conventional categories


TABLE 2

RANK ORDER CORRELATIONS BETWEEN THE ACTIVITY
RATINGS OF 35 THERAPIST RESPONSES BY PSYCHIATRISTS
AND BY MEDICAL STUDENTS WITH HIGH
AND LOW INTEREST IN PSYCHIATRY

Types of Subjects                    rho
High interest vs. psychiatrists      .867***
High interest vs. low interest       Te
Low interest vs. psychiatrists       .030"**

*** p < .001.
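
A rank order correlation of the kind reported in Table 2 can be sketched as a Pearson correlation computed on ranks, with tied values given their average rank. The two sets of item means below are hypothetical; they are not the ratings obtained from the psychiatrists or the students.

```python
# Rank order (Spearman) correlation as Pearson r on average ranks.
# The item means are hypothetical.
from math import sqrt


def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Product-moment correlation of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def rank_order_correlation(x, y):
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical mean Activity Levels of the same six responses in two groups.
psychiatrist_means = [1.6, 3.3, 4.3, 6.2, 7.8, 9.3]
student_means = [2.1, 3.0, 5.2, 5.0, 8.1, 9.0]

rho = rank_order_correlation(psychiatrist_means, student_means)
print(f"rho = {rho:.3f}")
```

Computing rho through average ranks rather than the shortcut formula on squared rank differences keeps the result correct when several responses receive the same mean rating.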
TABLE 3

PARALLEL ACTIVITY SCALES A AND B, AND MEAN ACTIVITY
LEVEL OF EACH PAIRED ITEM

Scale A

Mean AL   Descriptive responses
          Therapist says a single word or syllable to give the
          patient an invitation to continue.
3.3       Therapist repeats exactly what the patient said except
          for random changing of one or two words.
4.3       Therapist states a question or incomplete sentence which
          contains a key word or phrase from the patient's previous
          response.
5.8       Therapist asks for “an example” of what the patient has
          just reported.
6.2       Therapist focuses upon an objective, factual aspect of
          patient's life (age, job, salary).
6.7       Therapist restates things the patient has said, in a
          different way, to make the import clearer.
7.4       Therapist asks patient how he feels about something or
          some event which the patient has just talked about.
8         Therapist confronts the patient with a reformulation of
          things the patient has said, and asks if that is what he
          means.
          Therapist conveys his impression that there is something
          missing from the patient's story.
          Therapist suggests that what the patient has just said is
          inconsistent with certain other things said earlier by the
          patient.

Scale B

Mean AL   Descriptive responses
1.6       Therapist says “Hm-hm” to convey acceptance and
          understanding of the patient.
          Therapist makes a verbal response of two or three words,
          given as simple acceptance and understanding of what the
          patient says.
          Therapist asks the patient to tell him more, to elaborate a
          little, on a topic already mentioned.
          Therapist parries a question put to him by the patient, by
          directing it back to the patient.
6.3       Therapist inquires how long patient's symptoms have been
          present.
6.7       Therapist asks when some event (described by the patient)
          actually happened.
7.5       Therapist question focuses upon patient's transient
          thoughts within the interview situation, at a particular
          instant.
7.8       Therapist reflects a feeling or need clearly implied in the
          patient response, but not actually verbalized by the
          patient.
          Therapist summarizes a number of different responses made
          by the patient, which are essentially concerned with the
          same feeling, of which the patient is aware, and therapist
          labels the feeling.
9.3       Therapist points out some reality condition which is
          inconsistent or incompatible with the patient's wishes or
          expectations.


Di apist re-
Oh; ventional
tego? and a meaningful order of conven
The rics
5 A ion
along ned dimensior
g the assum me for analy
ciples of focus
go
Sy, Singe:
ty, Uq. er (1948), ý - of the

Study 3. One of the ultimate applied goals of the
research program was to study therapist verbal behavior
in the initial, rather than in the treatment
interview. Since initial interviews tend usually not
to involve the more active types of therapist operations
(e.g., interpretive), 25 of the most active responses
were removed from the original set of 50.
To the remaining 25 responses, 11 more were added.
The set of 36 responses was rated, as before,
along an 11-point scale of Activity, by three groups
of subjects. One group consisted of 15 of the original
30 psychiatrists used in Study 1. Two other groups
TABLE 4

MEAN ACTIVITY LEVEL AND STANDARD DEVIATION
FOR FIVE INITIAL INTERVIEWS (STUDY 4)

                   Number of      Mean Activity Level
Theoretical        therapist      per response over
orientation        responses a    entire interview       SD
Rogers                 48               6.2             1.15
Deutsch                58               5.9             1.33
Wolberg                95               5.8             1.68
Gill                  150               5.2             2.04
Finesingerian          62               4.4

a A maximum of two initial or terminal responses (e.g., a
greeting) in each interview was not rated.


were drawn from a class of 100 freshmen medical 
students. One group of 19 subjects had previously 
expressed very high interest in ultimate specializa- 
tion in psychiatry, while the other group of 18 sub- 
jects had expressed very low such interest. Inclu- 
sion of medical students with high and low interest 
provided a check upon the independence of Activity
ratings from psychiatric experience and sophistication.

students; the reliability 
for all three groups (Gi 
are almost identical wi 
Indeed, the value of 
Levels of the 25 responses common to Studies 1 and
3 was .945 (p < .001). The data indicate that reli-
ability of the rating procedure is but little altered
by psychiatric interest and experience.

Consequently, data from the psychiatrist subjects
in Study 3 were used to form two parallel Activity
Scales. These are presented in Table 3. Each ordinal
pair of items was matched on the basis of virtually
identical mean Activity Levels and of nonsignifi-
cantly different variances.

Study 4. This study was performed to make a pre- 
liminary test of the reliability and discriminatory ca- 
pacity of Scale A. The authors independently rated, 
in context, each therapist response in five unfamiliar 
published initial interviews. These, chosen for their 
divergence of theoretical adherence, were performed 
by Wolberg (1954, pp. 690-699); Deutsch (Deutsch
& Murphy, 1955, pp. 29-49); Skinner (Finesinger &
Powdermaker, undated); Gill (Gill et al., 1954, pp.
134-204); and Rogers (1947, pp. 128-142). Since the
Finesinger and Powdermaker interview was actually 
performed by a close adherent to the Finesinger tech- 
nique, all subsequent reference to this interview will 
be via the term “Finesingerian.” 

Results of Study 4. Reliability of scoring the five
interviews was .90 or better. Table 4 shows mean
and σ of each interview for one of the two raters.
The mean values differ from each other both by
Fisher's F and by Kruskal-Wallis' (1952) H (p
< .001). This result supported the assumption that
the Activity Scale samples a meaningful common
variable in therapist verbal behavior, and hence
justified a more powerful and elaborate reliability
study.

Study 5. This was undertaken (a) systematically 
to assess the range of reliability estimates obtained 
when professional but untrained raters apply the 
Activity Scales to unfamiliar printed interview ma- 
terial; and (b) to study the empirical equivalence 
(i.e., the interchangeability) of the two parallel Ac-
tivity Scales (see Scales A and B, Table 3).

Eight raters consisting of four clinical psycholo-
gists at the PhD level and four psychiatrists having
between 2 and 4 years of experience were used in a
modified latin square study adapted from Cutler,
Bordin, Williams, and Rigler (1958). For each of
four interviews the subject rated successive therapist
responses seriatim for Activity Level, using either
Scale A or Scale B. Each scale was used with a dif-
ferent pair of the four interviews presented to each
subject. In order to control for the possibility that
ratings of the therapist responses might be influenced
by the succeeding response from the patient, only
the therapist responses were presented for two of
the interviews rated by each subject (the "Context
Absent" condition); whereas for the other two inter-


TABLE 5
EXPERIMENTAL DESIGN OF STUDY 5


                 Scale A                 Scale B

            Context    Context      Context    Context
Rater       Absent     Present      Absent     Present


1. PhD Ivete M3 I4 r 
2. MD Iva Ws T:4 ke 

3. PhD  I:4 pyea Tl ra 

4 MD M:4 IV: PL a 
5. PhD 1.9 I4 Iv:3 Uei 

6 MD 1:2 I4 IV:3 we 

7. PhD fts wi me DVF 
8. MD Ts ñi Ma Mi 


Note.—"Context Absent" implies that the patient's responses
were not presented to the subject; "Context Present" implies
that they were so presented. Roman numerals refer to the
particular interview; Arabic numerals refer to the order of
presentation to the subject of a treatment com-
bination: a specific interview and a particular
TABLE 6

ANALYSIS OF VARIANCE

Source of variance              SS      df      MS        F
Sequences                       .53      3     .177      <1
Raters within sequences         .84      4     .210      <1
Total between raters           1.37      7
Order                          1.01      3     .337      <1
Interviews                    17.10      3    5.700     9.97***
Experimental conditions        3.89      3    1.297     2.20
  Context                       .28      1     .280      <1
  Scales                       3.30      1    3.300     5.77*
  Context X Scales              .31      1     .310      <1
Pooled error                   8.58     15     .572
Total within raters           30.58     24
Grand total                   31.95     31

* p < .05.
*** p < .001.
views the entire typed protocol was presented (the
"Context Present" condition). The Sequence of treat-
ment combinations presented to each subject, the
Order in which a given interview was rated, the
Context variable, and the Scale variable were sys-
tematically varied and controlled in the design shown
in Table 5. It will be noted that four pairs of sub-
jects (one psychiatrist and one clinical psychologist)
were treated identically with one of four sets of
treatment combinations.

The four interviews were selected from those used
in Study 4. They are hereafter referred to as Inter-
views I (Wolberg) consisting of 73 therapist re-
sponses;² II (Gill), 73 responses; III (Finesinger-


² It was necessary to ignore certain types of thera-
pist responses (two from each interview) such as an
initial greeting or a farewell.


TABLE 7

OVERALL MEAN ACTIVITY LEVEL AND SIGMA ASSIGNED
EACH OF FOUR INTERVIEWS BY EIGHT RATERS IN
STUDY 5

No.    Interview        Mean AL a     n b      σ
I      Wolberg             6.1        73     1.45**
II     Gill                ...        73     ...
III    Finesingerian       4.9        62     1.10**
IV     Rogers              6.7        48     ...

a Based upon the pooled ratings of eight subjects.
b Number of therapist responses rated.
** The variances are all significantly different from each other
(p < .01).


TABLE 8

INTERRATER RELIABILITIES AS A FUNCTION OF PROFESSIONAL SPECIALTY

                       Among PhD's         Among MD's       Among all Subjects
No.   Interview       Range    Median a   Range  Median a   Range     Median a
I     Wolberg        .66-.76     ...       ...     .56     .46-.79      .62
II    Gill           .72-.84     .78       ...     .50     .40-.85      .73
III   Finesingerian -.09-.60     .34       ...     .34       ...        .35
IV    Rogers         .00-.64     .38       ...     .36    -.05-.83      .29

Note.—Values of N are 73 for Interviews I and II; 62 for III; and 48 for
Interview IV. The minimum values of r for which p < .05 are .23 for
Interviews I and II; .25 for III; and .29 for IV. The minimum values for
which p < .01 are .30 for Interviews I and II; .32 for III; and .37 for IV.
The median reliabilities of Interviews I and II are higher than those of
Interviews III and IV, both in the PhD group and in the "all subjects"
group (p < .01).
a Each median value is derived from a population of 6 r's for the two
"specialty" groups, and from 1 of 28 r's for the "all subjects" group.
TABLE 9

INTERRATER RELIABILITIES AS A FUNCTION OF EXPERIMENTAL CONDITIONS

                                         Interview number
Combination of            I               II              III              IV
experimental
conditions          Range   Median   Range   Median   Range    Median   Range    Median
Same scale,
same context       .58-.79  .69**   .41-.85   .76    .36-.64    .41    .01-.68    .35
Different scale,
different context  .54-.74  .60     .40-.87   .69   -.14-.63    .37      ...      .35
Same context,
different scale    .40-.74  .64     .48-.84   .71    .09-.59    ...   -.05-.83    ...
Same scale,
different context  .49-.76  ...     ...-.79   .74    .19-.65    .34      ...      .44

Note.—"Same context" implies that the two subjects from whose data a given
r was derived both rated either with patient context present or with patient
context absent; "different context" implies that one member of the pair rated
with patient context present, the other member with patient context absent.
** The median values for Interviews I and III, line 2, are significantly
different (p < .01). See text for comments on other comparisons.


ian), 62 responses; and IV (Rogers), 48 responses.
In order to keep the subject's task within a reason-
able time limit, only the first 73 responses of Inter-
views I and II were used, while the other two in-
terviews were used in their entirety. The total time
taken for the four tasks varied from 1 to 3 hours
per subject.

Results of Study 5.³ The analysis of variance is
shown in Table 6. The most clear-cut and crucial
fact established by the analysis is that the mean Ac-
tivity Levels of the four interviews significantly
differ among themselves (p < .001). These mean
values are presented in Table 7. The rank order of
the four interviews accords exactly with that found
in Study 4. It is noteworthy that the Rogerian in-
terview (IV) turns out to be the most active and
the Finesingerian one the least active. The sole other
significant effect is for the Scale variable (p < .05).
This accords with an a priori hunch, but it is never-
theless potentially somewhat disturbing; for it will
furthermore be seen later that for three of the inter-
views, Scale A manifests greater reliability than does
Scale B. While the Context variable turns out not to
be significant (which is as it should be), it should
be noted, for the present, that this finding refers only
to overall mean values for each interview.
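
The F ratios quoted above can be checked by simple arithmetic: each mean square is SS divided by df, and each F is an effect mean square divided by the error mean square. The following is only a sketch of that check in Python. It assumes that every effect is tested against the pooled error term of Table 6 (SS = 8.58, df = 15); the variable names are illustrative and are not part of the original analysis.

```python
# Recompute mean squares and F ratios from SS and df as read from Table 6.
# Assumption: each effect is tested against the pooled error term.
effects = {
    "Interviews": (17.10, 3),
    "Scales": (3.30, 1),
    "Context": (0.28, 1),
}
pooled_error_ss, pooled_error_df = 8.58, 15

# Pooled error mean square (about .572)
ms_error = pooled_error_ss / pooled_error_df

f_ratios = {}
for name, (ss, df) in effects.items():
    ms_effect = ss / df              # mean square for this effect
    f_ratios[name] = ms_effect / ms_error

for name, f in f_ratios.items():
    print(f"{name}: F = {f:.2f} on {effects[name][1]} and {pooled_error_df} df")
```

Under that assumption the Interviews and Scales ratios reproduce the tabled values, and the Context ratio falls below 1.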
The between-rater reliabilities for each interview
were computed by IBM, yielding a total of 4(8 X 7)/
2 = 112 reliability coefficients (Pearson r's). One sum-
mary of these, presented in Table 8, breaks the data
down into a PhD group (clinical psychologists) and
an MD group (psychiatrists). The PhD group shows
nonsignificantly greater median within-group agree-
ment⁴ than does the MD group on Interviews I and
II. The same two interviews were rated more reli-
ably than Interviews III and IV, by both the PhD
group and the "all subjects" group (p < .01). That
the Rogers interview (IV) elicited low rater agree-
ment was not at all surprising, since many Rogerian
responses do seem extremely difficult to match to
scale items, and considerable argument was voiced
by several subjects that their ratings of this inter-
view were subjectively most unreliable. The equally
low reliability of the Finesingerian interview,
however, was rather surprising, and no satisfactory
explanation of this finding is forthcoming.

³ Michael S. Black, now of the University of Illi-
nois, performed most of the tedious office computa-
tions in Study 5.
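
The count of 112 coefficients, and the 6 and 28 coefficients behind the medians of Table 8, follow from counting unordered pairs of raters. A brief Python sketch of the counting is given here; the rater labels are hypothetical stand-ins, with odd-numbered raters taken as PhDs and even-numbered raters as MDs, as in the listing of Table 5.

```python
from itertools import combinations

# Eight raters: odd numbers are PhDs, even numbers are MDs (labels are illustrative).
raters = [(i, "PhD" if i % 2 == 1 else "MD") for i in range(1, 9)]
n_interviews = 4

# Every unordered pair of raters yields one Pearson r per interview.
all_pairs = list(combinations(raters, 2))
phd_pairs = [p for p in all_pairs if p[0][1] == "PhD" and p[1][1] == "PhD"]
md_pairs = [p for p in all_pairs if p[0][1] == "MD" and p[1][1] == "MD"]

total_coefficients = n_interviews * len(all_pairs)

print(len(all_pairs), len(phd_pairs), len(md_pairs), total_coefficients)
```

Each interview thus contributes 28 coefficients in all, of which 6 lie within each specialty group and the remaining 16 pair a PhD with an MD.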


Table 9 presents the median and range of inter-
rater r's as a function of the various experimental
conditions. Generally speaking, the highest reliability
coefficients are obtained when comparisons are made
with the same scale, and lowest when they are made
with ratings based upon different scales; but the dif-
ferences do not achieve conventional significance. The
reliabilities of both Interviews I and II are
arithmetically larger than those of Interviews
III and IV within all four experimental treatment
combinations. Only 1 of the 16 possible compari-
sons yields a significant value of p (see Table 9
level of significance footnote), while the median value
of p for the set is about .13. The effect of the Context
variable is slight when assessed from the data pre-
sented in Table 9. But a somewhat different picture
is presented in Table 10, where the reliabilities are
examined within pairs of raters (one PhD and one
MD), each of the two members having been treated
identically. For Interview I the Context variable
produces fairly consistent correlations within pairs of
subjects using either Scale A or Scale B. For Inter-
view II, however, the correlation is greater for the
Scale B condition when the patient context is present
than when it is absent (p < .05). For Interview III the
latter finding holds for both scales but does not
achieve conventional significance.

⁴ All comparisons of r's subsequently reported in
the paper were made, unless otherwise stated, with
the Median test (e.g., Siegel, 1956, pp. 111-116).

A surprising finding for Interview IV is that when 
the patient context is present the reliabilities drop, 
to near zero for Scale A (p> .05) and by 50% for 
Scale B (p = .06)! This type of finding was reported 
also by Cutler, Bordin, Williams, and Rigler (1958) 
whose analyst-fledgling subjects agreed significantly
less in ratings of Depth of Interpretation when they 
had patient material available to them. In the pres- 
ent study this finding is taken to reflect (again) the 
difficulties involved in rating the Rogerian material. 

A comparison of all r's involving Scale A with
complementary r's involving Scale B (Table 10)
shows that for Interviews I and II, consistently
greater within-pair agreement (p < .02) is obtained
with Scale A. The same comparisons for Interview 
III fall short of significance, although they too are 
in a consistent direction. For Interview IV (Rogers), 
on the other hand, exactly the opposite outcomes are 
observed, of which one is significant (p < .05) and 
the other nearly so (p =.06). 
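
Comparisons between two correlations of this kind are commonly made by transforming each r to Fisher's z and referring the difference to the normal curve. A minimal Python sketch of such a test is offered below. The function name is hypothetical, the two samples are treated as independent, and the inputs (r = .67 and r = .41, each based on 73 rated responses) are assumed here to be the Interview II, Scale B entries; they serve only as an illustration.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """Two-tailed normal test of the difference between two independent r's."""
    z1, z2 = math.atanh(r1), math.atanh(r2)            # Fisher's z transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))     # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))             # two-tailed normal probability
    return z, p

# Illustrative inputs (assumed): context present vs. context absent, 73 responses each.
z_stat, p_value = fisher_z_difference(0.67, 73, 0.41, 73)
print(f"z = {z_stat:.2f}, two-tailed p = {p_value:.3f}")
```

With these inputs the difference falls beyond the .05 level, which is in line with the significance level reported for that comparison.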


Discussion 


The empirically derived Activity Scales 
facilitate ratings having average reliability 
which is moderate (.51) for untrained raters 
(Study 5) and very high (.91) for well- 
trained raters (Study 4). The Activity Scales 
satisfactorily discriminate among the inter- 
views employed. In Study 5 however, dif- 
ferences among the reliabilities of the four 
interviews are considerable; the values for In- 
terviews III (Finesingerian) and IV (Rogers) 
being considerably lower than the estimates 
for the other two. The Rogers interview in 
addition leads to two unexpected discrepancies 
requiring discussion. 

A problem facing all of those who experi-
ment with ratings of therapist verbal behav-
ior concerns the selection of subjects. The
natural, defensible tendency is to obtain the
services of highly trained "experts." This, of
course, raises serious questions of practical
availability, since the expert not only has less
time to donate to research workers, but he is
also in much shorter supply than the non-
expert. While in Study 5 subjects with the

TABLE 10
CORRELATIONS WITHIN EACH PAIR OF RATERS UNDER
... EXPERIMENTAL CONDITIONS

                   Scale A               Scale B
               Patient Context       Patient Context
Number of
interview     Present    Absent     Present    Absent
I               .78        .79        .58        .59
II              .85        .84        .67        .41
III             .64        .40        ...        .36
IV              .01        .30        ...        .68

Note.—Each r is based upon a unique combination of vari-
ables ... Professional specialty (i.e., clinical ...) is
confounded with the Sequence, ... variables in all 16 indices.
For minimal values of r achieving significance, see the
general footnote to Table 8.

Values for Scale A are larger (p < .02) than those for Scale
B within Interviews I and II. Values for Scale B are larger
(p < .06) than those for Scale A in Interview IV. In Inter-
view II, Scale B, the r for Patient Context Present is larger than
that for Patient Context Absent (p < .05). In Interview IV,
the r for Patient Context Absent, Scale B, is larger than that for
Patient Context Present (p = .06). The preceding values of
p were obtained by using the z transformation (McNemar,
1955, p. 148).



PhD may have shown a slight, but inconsist- 
ent edge over those with the MD, the overall 
results indicate that the reliability of perform- 
ance is very much more a function of experi- 
mental conditions than of professional spe- 
cialty. A conclusion comparable in principle 
was reached by Cutler et al. (1958). 

The two Activity Scales not only led to 
different overall mean ratings of Activity 
Level; interrater reliabilities also differed as 
a function of the particular scale. Scale A was 
more reliably employed for three of the inter- 
views, Scale B being more reliably used with 
Interview IV (Rogers). This is somewhat 
alarming, because the selection of particular 
illustrative points along the empirical scale 
dimension was in the present case (and pre- 
sumably was in several other reported stud- 
ies—e.g., Harway, Dittman, Raush, Bordin, 
& Rigler, 1955) largely an arbitrary matter. 
The empirical differences between the scales 
thus raise an important theoretical issue 
which now deserves comment. 

On the one hand it is quite possible that 
the two scales have different dimensionalities, 
Scale B being, say, two-dimensional, and 
Scale A one-dimensional. (One-dimensionality 
of the Activity continuum has heretofore been 
assumed, but not empirically proven.) On the
other hand, it is also possible that the (Ro- 
gerian) responses in Interview IV are in toto 
two-dimensional, while those of the other 
three interviews can be adequately repre- 
sented with a single dimension. Granted these 
two contingencies then it would follow that, 
with Interview IV, Scale B would elicit 
greater interrater reliability than Scale A. It 
is furthermore suggested that Interpretation 
may constitute this second dimension among 
both the items of Scale B and the therapist 
responses in Interview IV. 

The plausibility of the foregoing hypotheti- 
cal argument may be clearer in the light of 
the following considerations. The “reflection,” 
which is the basic and most frequent verbal 
operation in the Rogerian interview (Rogers, 
1951) presumably takes one some distance 
along a dimension of interpretation. In con- 
trast, the other three interviews used in 
Study 5 between them contain less than a 
half-dozen responses that could be classified 
as “interpretive.” Furthermore, inspection of 
Table 3 shows that Item 8 in Scale B con- 
tains the sole reference (in either scale) to 
“reflection of feelings.” It is likely that sub- 
jects employed this category to classify those 
responses in Interview IV which were typi-
cally Rogerian in nature,⁵ whereas in Scale A
no comparable item lay at the subject's dis-
posal.⁶ Consequently, the reliability of Inter-
view IV would turn out, as suggested above,
to be higher with Scale B than with Scale A.

When one speaks of Scale B as facilitating
"higher reliability" of ratings for Interview IV
it must be noted, however, that ratings of
therapist responses in this interview mark- 
edly drop under both scale conditions when 
patient context is added. This finding is quite 


5 At least 5 of the subjects were clearly aware that 
Interview IV was Rogerian. 

6 A rough check bearing out the tenability of this
hypothesis is as follows. A frequency count across
all eight raters was made of the frequencies with
which, for each interview, the eighth item in Scale A
and Scale B were employed. For Interview I, the
respective proportions of responses classified in Item
8 were .15 for Scale A, and .02 for Scale B. For
Interview II the respective proportions were .02 and
.04; and for Interview III, .00 and .10. For the
Rogerian interview (IV), however, the proportions
were .30 for Scale A, and .53 for Scale B.
opposite to that for Interviews I, II, and III;
and furthermore is not in accordance with
logical expectation. For, depending upon the
nature of some given "exploratory question,"
say, from the therapist, there should be pre-
dictable effects upon rating-reliability, of the
addition of patient context. If the referents
of the therapist question are absolutely clear
to the rater (i.e., if the referents are com-
pletely defined by the question per se) then
addition of patient context should not affect 
the reliability of rated Activity Level. If, 
however, the referents of the therapist ques- 
tion are not entirely clear to the rater, then 
addition of patient material should raise (but 
never lower) the reliability of rated Activity 
Level for the particular question. 

The foregoing suggests that the very pres- 
ence of patient context during rating of Ro- 
gerian responses in Interview IV somehow 
undermined the subject's understanding of
what a Rogerian reflection of feeling looks
like. Indeed, at least in the particular inter- 
view studied here, the reflection frequently 
does not seem, subjectively, to bear any con- 
sistent contextual relation to whatever fol- 
lows from the patient. 

From the standpoint of theory and research
it is desirable to examine in more detail this
question of dimensionality with respect to
both Scale B and the Rogerian therapist be-
havior of Interview IV, in hope that a modi-
fied scale might be assembled having high
reliability with both non-Rogerian and Ro-
gerian material. But this whole issue is of
course an applied offshoot of the more gen-
eral and fundamental question of the dimen-
sional relations between elicitation of infor-
mation, and interpretation of information
(which according to the results of Studies
1 and 2 involves relatively high Activity Level).
A study along the lines of ... Bordin, ... Rigler,
Williams, ... Dittmann, and Hays (1956) is to be
performed in the near future.

Finally, it is felt that, subject to the
restrictions outlined above which demand
clarification, we have examined some of
the critical variables affecting the reliability
of ratings of therapist Activity Level, and
that the scales themselves are sufficiently
meaningful and reliable to justify their ap-
plication in further research on the initial as
well as the therapeutic interview. Attention
may now be turned toward specification and
examination of relevant variables in the pa-
tient's behavior as a function of therapist
Activity Level.


SUMMARY 


This paper describes the development of a 
parallel pair of scales for assessing the Ac- 
tivity Level of discrete Therapist Verbal Re- 
sponses, and the application of the scales to 
several published initial interviews. In Study 
1, 30 Board-certified psychiatrists rated 50
abstract descriptions of Therapist Verbal Re- 
sponses along an 11-point scale of “Activity,” 
the latter being defined in terms of the de- 
gree of “Ambiguity, Lead, and Inference.” 
Interjudge reliability was .50, and the intra-
class r, .93. Each rater also categorized the
50 responses according to whether he con- 
sidered them primarily used for purposes of 
treatment, or for purposes of diagnosis. Those 
therapist responses agreed to be primarily 
“therapeutic” in purpose were rated with a 
considerably higher mean Activity Level than 
others classified as “diagnostic” in purpose.

In Study 2 it was shown that a large ma- 
jority of the 50 therapist responses was agreed 
by independent judges to typify one of the 
following conventional categories of therapist 
operation: Simple Facilitation, Exploration, 
Clarification, Interpretation, and Supportive
Reassurance. The responses classified in these 
successive categories, respectively, showed in-
creasingly higher mean Activity Levels. Con-
sequently, it was assumed that the main set
of 50 responses included representative ele-
ments from the entire range of typical thera-
pist operations.

Study 3 involved further rating of a re-
vised set of 36 Therapist Verbal Responses
belonging mainly in the categories of Simple
Facilitation, Exploration, and Clarification.
The subjects consisted of 15 of the psychia-
trists used in Study 1, and 37 freshmen medi-
cal students, 19 with high, and 18 with low
interest in psychiatry. Reliability of ratings
was only slightly lower for student subjects
than for psychiatrist subjects; and students
with low interest in psychiatry showed least
(though still highly significant) agreement
with the psychiatrists’ rank ordering of the
therapist responses. Since the rating pro- 
cedure did not appear to be a serious func- 
tion of either professional interest or experi- 
ence, a parallel pair of 10-point scales of Ac- 
tivity Level were assembled using data from 
the psychiatrist subjects. 

In Study 4 the individual therapist re- 
sponses of five unfamiliar published initial 
interviews were rated by both authors, using 
Scale A. Interjudge correlation was .90 or 
better. A more elaborate and rigorous reli- 
ability study was then performed. 

In Study 5 the two Activity Scales were 
then employed in a latin square design re- 
quiring eight untrained raters (four psychia- 
trists and four clinical psychologists) to rate 
for Activity Level the therapist responses in 
four widely differing published initial inter- 
views (by Wolberg, by Gill, by a Finesinger- 
ian, and by Rogers). Scale A vs. Scale B con- 
stituted one factorial variable, and Patient 
Context Absent vs. Present constituted the 
other. The analysis of variance showed a sig- 
nificant difference among the interviews, and 
a significant main effect for the Scale vari- 
able. When interjudge reliabilities were ex- 
amined the two types of subjects (psychia- 
trists and psychologists) showed only minor 
differences. Further, the Wolberg and the Gill 
interviews were consistently more reliably 
rated than were the other two. Scale A, how- 
ever, was consistently more reliably employed 
than Scale B with three of the four inter- 
views, but Scale B was more reliably em- 
ployed with the fourth (Rogerian) interview. 
Furthermore, while adding Patient Context 
either increased or did not affect reliability 
of rating (with either scale) of the first three 
interviews, the reliability of rating the Ro- 
gerian interview clearly decreased. 

The discrepancies involving the Rogerian 
interview were discussed, and a hypothetical 
basis for their occurrence was advanced which 
concerned the dimensionalities of the two 
scales and of Rogerian vs. non-Rogerian 
therapist responses. It is concluded that while 
the general problem of dimensionality needs 
further examination, we have a pair of 
parallel Activity Scales the reliabilities of 
which are comparatively satisfactory (the 
grand median of 112 coefficients is .50), and 
that we have explored some of the variables
likely to affect their application by untrained
professional raters. One may now move to-
ward investigation of patient variables as a
function of Therapist Activity Level.
THE ACCURACY OF CLINICAL PSYCHOLOGISTS’ ESTIMATES
OF INTERVIEWEES’ INTELLIGENCE¹


ZANWIL SPERBER AND ARTHUR M. ADLERSTEIN
Children’s Hospital of Philadelphia, Pennsylvania 


That accurate appraisal of intellectual ca- 
pacity can best be accomplished using stand- 
ardized test procedures is well accepted. There 
are often occasions, however, when clinical 
decisions are influenced by judgments of in- 
telligence which must be based on observa- 
tions rather than tests. It is therefore impor- 
tant to ask, “How good are clinicians’ esti- 
mates of intelligence?” The purpose of this 
paper is to present data indicating the rela- 
tionship between clinical psychologists’ esti- 
mates of intelligence, based on observations 
of only the verbal behavior of interviewees, 
and psychometric measures of the interviewees’


intelligence. 


METHOD


Subjects. Five clinical psychologists served as
judges. Four were PhDs. All had substantial experi-
ence in intelligence testing of adults and children,
and were familiar with current approaches to the
conceptualization and measurement of intelligence.

Interviewees. The women whose IQs were esti-
mated had been interviewed as part of a follow-up
study of their children, 4-6 years of age, who had
been treated medically at birth (see Footnote 1). The
sample size was set at 30 because the time required
for data collection given this N reached the limit
volunteer judges could be asked to contribute. We
also felt this N was large enough to yield meaning-
ful statistical results.


1 The interviews and intelligence test data used in
the present research were collected as part of a study
of children who had had blood problems as neonates,
in most cases involving Rh incompatibility, and were
treated by exchange transfusions. The follow-up
study was supported by the National Institute of
Neurological Diseases and Blindness, National Insti-
tutes of Health, United States Public Health Service
as part of the Collaborative Project to Study the
Etiology of Cerebral Palsy and Other Neurological
Diseases of Infancy and Childhood.

T. McNair Scott is Senior Investigator for the
Collaborative Project at Children’s Hospital ...
... R. Boggs, Jr., pediatrician; C. ..., ...ologist;
and J. A. Rose, psychiatrist, are ... investigators
for the Collaborative Project and the follow-up study.

We appreciate the contribution made by our psy-
chologist colleagues who served as judges, and Eliza-
beth Hirshman’s assistance with the statistical com-
putations.

The 30 cases were drawn using a stratified random 
procedure to be representative of the range and mean 
of the intelligence test scores of the larger follow-up 
group. A social class categorization of the husbands’ 
occupations (Sarason & Mandler, 1952; Sperber, 
1959) shows that the sample included members of 
the middle, lower middle, skilled, and unskilled 
worker classes. Table 1 shows the age, education, 
and measured IQ of the sample. 

Interviews. The hour-long interviews were con- 
ducted by psychiatrists and focused on the child’s 
developmental progress, and on maternal attitudes. 
Attempts were made to elicit some description of the 
mother’s life history. The interviews were tape re- 
corded and verbatim transcripts typed. Judges 1, 2, 
and 3 made intelligence estimates after reading the 
transcripts, Judges 4 and 5 after hearing tape re- 
cordings. The judges had no other contact with the 
mothers. 

Intelligence criteria. An abbreviated WAIS (Wechs- 
ler, 1955) consisting of four subtests, Vocabulary, In- 
formation, Block Design, and Picture Arrangement 
was routinely administered to the mothers. The four 
subtests give a good approximation to the IQ ob- 
tained using the full scale (Cohen, 1957; Doppelt, 
1956; Himelstein, 1957). Hereafter the prorated IQ 
based on scores on the four subtests will be called 


TABLE 1

AGE, EDUCATION, AND MEASURED IQ
OF THE INTERVIEWEE

Variable                 N a     Mean      SD      Range
Age (in years)           30      36.2      6.5     25-54
Years of education       25      11.8      1.9     8-16
IQ criteria
  WAIS b                 29     101.7     13.5     68-135
  Vocabulary             30     104.5     18.1     70-149

a Number of cases with relevant data available.
b Based on four subtests: Vocabulary, Information, Block
Design, Picture Arrangement.
TABLE 2


CORRELATIONS BETWEEN JUDGES’ IQ ESTIMATES
AND OTHER CRITERIA


aca Mean 
7 5 r 
IQ criteria Noa @ 8 4 6 ee 
Vai 29 .43* .82 .76 .72 .69 .70 
Wais 2 ‘ BS T 
0O .67 .80 .78 .72 .76 
abulary 3 6 2 7 7. 
Koa Judges"* 30 .74 .77 .72 .77 .78 -76 


3 Entries are mean porraanons Debaeee indicated judge's 

i the other four judges. iat 
aaa Ea aA thossiob the of her four UIRE ia ionat at 
the p <01 level. 


the WAIS IQ. WAIS IQs were available for 29
mothers. In the last phase of the follow-up study
only the Vocabulary subtest was given. One of the
mothers inadvertently included in our sample was
from the later group.

Since verbal production served as the judges’ pri-
mary source of information about the interviewees’
intelligence, a prorated IQ based only on the inter-
viewees’ scores on the WAIS Vocabulary test was
used as a second criterion. Hereafter the prorated
IQ based on the Vocabulary subtest will be called 
the Vocabulary IQ. 

Procedure. Judges were asked to assess the inter-
viewees’ IQs with no further discussion of how they
should define intelligence or use the interview ma-
terial. They made their judgments independently,
specifying an exact number for estimates between 70
and 140, and indicating after each estimate whether
the judgment had been made with high or low con-
fidence.

Judges were aware of the general nature of the
follow-up study and knew the mothers had taken an
abbreviated WAIS. For each case they were told the
... as the subject of the interview. ... arranged
to form five random sequences of IQs. Each judge
followed a different sequence in making his estimates.


RESULTS 


Correlational analyses. Product-moment cor-
relations were calculated between each judge’s
estimates and (a) the WAIS IQ, (b) the Vo-
cabulary IQ, (c) each other judge’s IQ esti-
mates. The 10 coefficients between estimated
and measured IQ, presented in Table 2, are
positive and, with one exception, substantial.
The estimates of all possible pairs of judges
were positively correlated at the .01 level.
Table 2 shows the mean correlations between
each judge’s estimates and those of the other
judges. Presence of voice cues did not influ-
ence the correlations between judges’ esti-
mates, or between estimates and criteria.
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Discrepancies between measured IQ and 
estimated IQ. The mean IQ assigned by each 
of the five judges ranged from 100.5 to 105.3, 
corresponding closely to the test means shown 
in Table 1. The SDs of judges’ estimates were 
somewhat smaller than the SDs of test scores. 
Four judges restricted their estimates to an 
80-126 IQ range. 

As indicated in Table 3 judges atiman 
often deviated appreciably from the measured 
IQs. Over all five judges the mean discrep- 
ancy between the estimated IQs and WAIS 
IQs was 7.8 points. The mean discrepancy 
from the Vocabulary IQ was 9.9 points. Con- 
sidering only the WAIS criterion in relation 
to which judges’ estimates were more aer 
rate, 83% of the estimates made by the a 
accurate judge, and 66% made by the Jean 
accurate judge, were within 10 IQ points at 
the criterion. Over all five judges, 727 ° 
the estimates were within this range. : 

Table 3 presents the result of two addi- 
tional analyses. Knowing that the WAIS was 
standardized so that the mean IQ e a ‘ 
sample representative of the “population . 
large” would be 100 (Wechsler, 1955) 2o 
accurate would a clinician be if he simpy 
“programed” himself to estimate each inter 


TABLE 3 


DISCREPANCIES BETWEEN ESTIMATED IQ 


AND MEASURED IQ 


5 
Source of estimate critics Discrepancy 
e 
Judge Mean SD Rang 
0-25 
$ WAIS 89 GF 139 
Vocabulary 10.3 9. oii 
2 WAIS 5A 47 gal 
Vocabulary 9.2 7. 1-18 
is WAIS 7o 55 4-25 
Vocabulary 9.8 7. 0-28 
$ WAIS Sr P 4038 
Vocabulary 10.4 Et 
s WAIS 32 $3 4-28 
Vocabulary 9.6 i 
Assumed Populatio: 35 
a 0-35 
mean (IQ = 100) wars 11.0 8.0 0-89 
Vocabulary 15.1 11.0 
WAIS IQ vs at 
Vocabulary 19 é na ? 
ay ocabula"” 


Vi 
criterion,” for WAIS 19 criterion; N = 30 for 


h- ai 


Estimates of Interviewees’ Intelligence 


viewce’s LQ as 100? * As shown in Table 3, 
the mean, SD, and range of discrepancies ob- 
tained by a hypothetical programed judge 
would have been larger than those of of our 
judges. Table 3 also shows the mean, SD, and 
range of discrepancies between the inter- 
viewees’ WAIS IQs and their Vocabulary 
IQs. Despite test overlap, the discrepancies 
between the two psychometrically derived 
IQs are not appreciably less than those ob- 
served between judges’ estimates and the 
WAIS criterion. : 
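The two kinds of analysis reported above, correlation with a criterion and absolute discrepancy from it, answer different questions, and a small computation may make the distinction concrete. The sketch below is illustrative only: the IQ values are hypothetical and are not the study's data, and the function names are introduced here for the example. It computes a product-moment correlation, the mean, SD, and range of absolute discrepancies, the percentage of estimates within 10 points, and the same discrepancy summary for a "programed" judge who always answers 100.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def discrepancy_summary(estimates, criterion, tolerance=10):
    """Mean, SD, and range of absolute discrepancies, plus percent within tolerance."""
    diffs = [abs(e - c) for e, c in zip(estimates, criterion)]
    n = len(diffs)
    mean = sum(diffs) / n
    sd = sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))
    pct = 100 * sum(d <= tolerance for d in diffs) / n
    return {"mean": mean, "sd": sd, "range": (min(diffs), max(diffs)), "pct_within": pct}


# Hypothetical values for illustration; NOT taken from the study.
measured_iq = [88, 95, 100, 104, 110, 117, 123, 92]
judge_estimates = [95, 98, 100, 102, 108, 110, 115, 99]
programed_estimates = [100] * len(measured_iq)  # always guess the population mean

r_judge = pearson_r(judge_estimates, measured_iq)
judge = discrepancy_summary(judge_estimates, measured_iq)
programed = discrepancy_summary(programed_estimates, measured_iq)

print(f"judge r = {r_judge:.2f}, mean discrepancy = {judge['mean']:.1f}")
print(f"programed judge mean discrepancy = {programed['mean']:.1f}")
```

A correlation cannot be computed for the programed judge, whose estimates have no variance, which is one reason the discrepancy analysis is the natural comparison for that case.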
Judges’ confidence and their IQ estimates. The five judges
differed markedly with respect to the confidence with which
they made the IQ estimates. Judge 1 was highly confident on
only four estimates, Judges 4, 5, and 3 on 13, 15, and 16
estimates, respectively, while Judge 2 was highly confident on
23 estimates. This degree of variability suggests that the
source of confidence was unique to the judge and not a
function of some observable attribute of the interview
material. To test this supposition judges were paired in all
10 possible combinations. The percentage of cases where both
judges agreed in feeling either high or low confidence was
compared to the percentage of agreements expected by chance.
Agreement ranged from 30% to 53%, falling below chance
expectancy for five pairs of judges.
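The chance expectancy for a pair of judges can be sketched from the counts of high confidence estimates given above. The source does not say how chance agreement was computed; the sketch assumes the usual independence model, in which two judges agree by chance when both happen to be high or both happen to be low, and it assumes each judge rated all 30 cases. The names used are hypothetical.

```python
from itertools import combinations

N_CASES = 30  # assumed: every judge rated all 30 interviews

# Number of high confidence estimates per judge, as reported in the text.
HIGH_CONFIDENCE = {1: 4, 2: 23, 3: 16, 4: 13, 5: 15}


def chance_agreement(high_a, high_b, n=N_CASES):
    """Expected proportion of agreements if the two judges' confidence were independent."""
    p_a, p_b = high_a / n, high_b / n
    return p_a * p_b + (1 - p_a) * (1 - p_b)


expected = {
    (a, b): chance_agreement(HIGH_CONFIDENCE[a], HIGH_CONFIDENCE[b])
    for a, b in combinations(sorted(HIGH_CONFIDENCE), 2)
}

for (a, b), p in expected.items():
    print(f"Judges {a} and {b}: chance agreement = {100 * p:.0f}%")
```

Under this assumption the expected values differ from pair to pair, so each observed percentage has to be compared with the expectancy for its own pair rather than with a single figure.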
The degree to which judges tended to feel highly confident
after making IQ estimates was examined in relation to their
performance. The judges were ranked for confidence level,
assigning Rank 1 to the judge with the largest number of high
confidence estimates. The criterion for a judge’s performance
as an IQ estimator was the average discrepancy between his IQ
estimates and the WAIS IQs, the judge with the smallest
average discrepancy being assigned Rank 1. The resulting
rank-order correlation was .90, p < .05 (Senders, 1958).
Within judges there was a slight but consistent reversal in
the relationship between confidence and accuracy. Biserial
correlations between the judge’s confidence (high or low) and
size of the discrepancies between his estimates and the WAIS
IQs ranged from .03 to .18, indicating that judges tended to
be more accurate on those estimates they made with less
confidence.

* We wish to thank our colleague, Edna Small, who suggested
this analysis.
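A rank-order correlation over five judges can take only a few values, which a short sketch makes visible. The confidence ranks below follow from the counts reported in the text (Judge 2 highest, Judge 1 lowest). The accuracy ranks are hypothetical: they are chosen only so that the sum of squared rank differences is 2, which is what a coefficient of .90 with five judges implies under the usual formula; the actual ordering of the judges on accuracy is not asserted here.

```python
def spearman_rho(rank_x, rank_y):
    """Rank-order correlation for untied ranks: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(rank_x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def rank_descending(scores):
    """Rank 1 goes to the largest score; assumes no ties."""
    order = sorted(scores, key=scores.get, reverse=True)
    return {judge: position + 1 for position, judge in enumerate(order)}


# High confidence counts per judge, from the text.
confidence_rank = rank_descending({1: 4, 2: 23, 3: 16, 4: 13, 5: 15})

# HYPOTHETICAL accuracy ranks (Rank 1 = smallest mean discrepancy), for illustration only.
accuracy_rank = {1: 5, 2: 1, 3: 2, 4: 3, 5: 4}

judges = sorted(confidence_rank)
rho = spearman_rho(
    [confidence_rank[j] for j in judges],
    [accuracy_rank[j] for j in judges],
)
print(f"rho = {rho:.2f}")
```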


DISCUSSION


The results are consistent with the findings 
of other investigators who have reported cor- 
relations between intelligence estimates made 
without benefit of psychometric data and in- 
telligence test scores (Hanna, 1950; Marsh & 
Perrin, 1925; Wilson, 1954). Substantial dis- 
crepancies between measured and estimated 
IQs did occur in individual cases, but 72% 
of the judges’ estimates were within 10 points 
of the WAIS criterion, and those of the more 
accurate judges did not deviate from the 
WAIS IQ any more than did a measured 
IQ based on the WAIS Vocabulary score. 

Judges apparently are realistic in deciding 
how much confidence, in general, to place in 
their IQ estimates. There was a direct rela- 
tionship between the number of judgments 
made with high confidence and the accuracy 
of judges. Written comments volunteered by 
three judges suggest that the contrary tend- 
ency for all judges to be a little more accu- 
rate on low confidence judgments compared 
to their own high confidence judgments was 
a function of their considering additional as- 
pects of the interviews on cases perceived as 
difficult to evaluate. 

Some of the larger discrepancies in judg- 
ment occurred because judges overestimated 
the IQ of the cases of low intelligence and 
underestimated the high extremes. A trained 
observer’s estimate is, therefore, not to be 
considered a substitute for a good intelligence 
test where precise data are required, although 
it may be sufficient when a general idea of 
the client’s intellectual capacity is all that is 
needed. Within these limits the present study 
indicates that experienced psychologists can 
make clinically useful estimates of inter- 
viewees’ intelligence. The findings should not 
be generalized to teachers, parents, physi- 
cians, or other judge groups without further 
research. 


SUMMARY 


Five clinical psychologists estimated the 
IQs of 30 mothers who had been interviewed 
by psychiatrists, three judges after reading
transcripts and two after hearing tape re-
cordings of the interviews. IQ estimates were
compared with the prorated IQ based on 
WAIS subtests. 

Correlations between estimates and the 
WAIS criterion were significant (mean r 
= .70), with 72% of the estimates within 10 
points of the criterion. More confident judges 
were more accurate in their estimates. 

The results indicate that experienced clini- 
cal psychologists can make useful estimates 
of interviewees’ intelligence.


STIMULUS GENERALIZATION IN BRAIN
DAMAGED CHILDREN 1


SARNOFF A. MEDNICK

University of Michigan

Yale University


Stimulus generalization (SG) can be said to have occurred when
a response previously trained to be elicited by stimulus O can
also be elicited by test stimuli similar to O. This phenomenon
has been used extensively in explanation of verbal learning
(Gibson, 1940), social activity (Hull, 1950), and clinical
behavior (Mednick, 1958b). A study by Mednick has suggested
that the behavioral deficit of the brain damaged adult usually
described by the term “concrete” may also be understood in
terms of SG. He found that the SG responsiveness of these
patients was sharply curtailed (Mednick, 1955). Research has
suggested that the concreteness observed in the brain damaged
adult has its counterpart in the child. For example, Cotton
was particularly struck by the similarity between her group of
children suffering from cerebral palsy and the brain damaged
adult (Cotton, 1941). In view of these findings, it seemed
advisable to compare the generalization reactiveness of brain
damaged and intact children. In terms of the previously
obtained results with adults, it was hypothesized that brain
damaged children would demonstrate less SG than intact
children.


METHOD

Apparatus. The apparatus was adapted from one devised by
Brown, Bilodeau, and Baron (1951). It consisted of a
horizontal row of 11 lamps on a curved plywood panel 6 feet by
2 feet, mounted on its long edge on a table. The lamps were
spaced 9 degrees apart and were equidistant from the subject’s
eyes when the subject was seated directly in front of, and 3.5
feet away from, the center lamp. The lamps will be designated
by Numbers 1 through 11, with Lamp 1 being on the left of the
subject, Lamp 11 on the right of the subject, and Lamp 6 being
the center lamp. (The lamps used in this study were 1, 3, 5,
6, 7, 9, 11.) A red jeweled flashlight lamp, 2 inches above
the center lamp, served as a fixation point and a ready
signal. The reaction key was placed on the subject’s lap. The
experimenter was seated behind the panel out of the subject’s
view.

1 The authors wish to express their appreciation for the help
and advice of [names illegible] of the Children’s Hospital,
Boston, Massachusetts, and [name illegible] of the [illegible]
School, Waltham, Massachusetts. The work was partially
supported by a United States Public Health Service Grant
No. M 1519 to the senior author while he was at Harvard
University.

Subjects. Thirty-six children were tested in this
study. Eighteen were patients at the Cerebral Palsy
Clinic of the Children’s Hospital in Boston. The ma- 
jority of them were diagnosed as spastic, but several 
athetoid cases were also included. There were 10 
boys and 8 girls, ranging in age from 6-6 to 14-10 
years, and in intelligence from Mental Defective to 
Superior. The 18 children in the Control group (16 
boys and 2 girls) were equated with the Cerebral 
Palsy (CP) group for age and IQ. Five mental de- 
fective children, with no evidence of organic brain 
damage, were matched individual for individual with 
respect to IQ and age with the 5 cerebral palsied 
children with subnormal IQs; the remaining 13 sub- 
jects of normal intelligence in the Control group 
were taken from a previous study (Mednick & 
Lehtinen, 1957). None of the CP children used in 
the study had a known visual defect. They all had 
full use of at least one arm and hand. 

Procedure. Subjects were set to lift their hand 
from the reaction key as quickly as possible when 
the center lamp was lit. They were told that other 
lamps would be lit occasionally, but that they were 
only to respond to the center lamp. Subjects were 
encouraged to respond as quickly as possible. The 
latency of response was measured to the nearest 
one-hundredth of a second with a Standard Electric 
Timer. Two criteria were decided upon to deter- 
mine whether the subject was capable of performing 
the task. First, the experimenter went through the 
instructions with the subject as many times as was 
necessary for him to be able to repeat them cor- 
rectly. Somewhat more explanation usually proved 
necessary for the CP child than for the intact child. 
Secondly, a behavioral test of the subject’s ability 
to understand and perform the task was also em- 
ployed. After the instructions, the subject received 
two demonstration-test trials. If the subject responded
inappropriately, he was discarded. No intact children were
discarded; 10 CP children were discarded.


TABLE 1

PROPORTION OF SUBJECTS THAT RESPONDED AT LEAST ONCE AND TOTAL NUMBER
OF RESPONSES AT EACH TEST LAMP

                                          Test lamp
Group                            1     3     5     6     7     9     11
Proportion of group responding
at least once
  Cerebral palsy
  Control
Total number of responses
  Cerebral palsy                 6     3
  Control

[Remaining entries legible but column positions uncertain:
.50, 1.00, .39, .18, 1.00, .72; 19, 20]

Ten consecutive training trials with the center lamp (10-15
seconds intertrial intervals) were then given. The training
trials were followed without warning by a test series during
which six of the peripheral lamps (Lamps 1, 3, 5, 7, 9, 11)
were presented twice each, interspersed with 17 “booster”
trials with the center lamp in a counterbalanced order. The
total number of trials in the test series was 29, 17 with the
center lamp and 12 with the peripheral lamps. Zero, one, two,
or three center lamp booster trials intervened between
successive test trials with the peripheral lamps. Six
different orders were used for the test trials, each order
beginning with a different peripheral lamp. Three subjects
were assigned to each order from the CP and Control groups.
Approximately 50% verbal reinforcement was used to keep the
subject concentrated on the task and to promote optimal
reaction times.


TABLE 2

FREQUENCY DISTRIBUTION COMPARING GROUPS ON
STIMULUS GENERALIZATION RESPONSIVENESS

                              Number of individuals
Number of SG responses      Cerebral palsy     Control
0                                 3                1
1                                 1                1
2                                 6                2
3                                 4                1
4                                 4                2
5                                 0                6
6                                 0                1
7                                 0                2
8                                 0                2
Total                            18               18


RESULTS 


In previous research using voluntary response measures of SG
responsiveness, no relationship has been found to exist
between latency and frequency measures of SG (Gibson, 1939;
Mednick, 1955; Mednick & Freedman, 1960; Rosenbaum, 1953)
except under special conditions (Mednick, 1958a). These
results were also observed in this experiment. The two groups
did not differ significantly in mean latency of response on
the training trials (the mean latency for the CP group was
.393; the mean latency for the Control group was .334) nor was
there a relationship between latency and frequency of response
when these variables were dichotomized and subjected to chi
square analysis.

The frequency generalization data are presented in Table 1 in
the form of the proportion of subjects in each group
responding at least once to a given lamp. As can be seen, the
CP group showed less SG responsiveness than the Control group
at every lamp. This is also reflected in a count of the total
number of responses made at each lamp by the two groups (also
in Table 1).

The first SG test trial is considered an important indicator
of SG responsiveness, and is relatively untainted by the
effects of discrimination and extinction. On this test trial
14 of the 18 Control subjects responded, while only 7 of the
18 CP subjects responded. This difference is significant (chi
square, corrected = 4.11, df = 1, p < .05).
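The corrected chi square for the first test trial can be checked from the counts given in the text. The sketch below assumes that "corrected" refers to the usual continuity (Yates) correction for a 2 × 2 table; the function name is introduced here for illustration.

```python
def yates_chi_square(a, b, c, d):
    """Continuity-corrected chi square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: Control, CP. Columns: responded, did not respond on the first SG test trial.
first_trial_chi2 = yates_chi_square(14, 4, 7, 11)
print(round(first_trial_chi2, 2))  # 4.11
```

The result agrees with the reported value of 4.11 with 1 degree of freedom.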

Table 2 presents a frequency distribution
comparing the SG responsiveness of the CP
and Control children. While none of the CP
children gave more than four SG responses,
11 or 61% of the Controls showed five or
more responses. The group differences are
significant (chi square = 15.84, df = 2, p
< .01). This test was performed by combin-
ing Rows 0-2, Rows 3 and 4, and Rows 5-8,
collapsing Table 2 into a 3 × 2 table.
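The collapsing of Table 2 and the resulting chi square can be reproduced as a sketch. One caveat is an assumption of this example: the Control count for four SG responses is not legible in the table as printed here, and it is taken as 2, the value implied by a column total of 18 and by the statement that 11 Controls gave five or more responses. CP counts for five to eight responses are taken as zero, as the text states.

```python
def chi_square(table):
    """Pearson chi square and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Number of individuals giving 0..8 SG responses.
cp = [3, 1, 6, 4, 4, 0, 0, 0, 0]
control = [1, 1, 2, 1, 2, 6, 1, 2, 2]  # entry for 4 responses (2) is inferred, see above

# Combine Rows 0-2, Rows 3 and 4, and Rows 5-8 into a 3 x 2 table.
collapsed = [
    [sum(cp[0:3]), sum(control[0:3])],
    [sum(cp[3:5]), sum(control[3:5])],
    [sum(cp[5:9]), sum(control[5:9])],
]

table2_chi2, table2_df = chi_square(collapsed)
print(collapsed, round(table2_chi2, 2), table2_df)
```

With the inferred cell, the collapsed table gives the reported chi square of 15.84 with 2 degrees of freedom, which supports that reading of the table.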


DISCUSSION


The hypothesis that the brain damaged 
children would evidence a diminished degree 
of SG responsiveness is supported by the re- 
sults. It seems likely that this finding may 
help explain the behavior of the brain dam- 
aged child, which has been described as con- 
crete. An often-cited clinical example of con- 
crete behavior concerns the child who has 
been trained to complete a task seated in a 
certain way at a certain table. However, 
when his position is altered or table changed, 
he is no longer able to perform the task. 
Clearly, this could also be explained as an 
instance of failure of SG. The second stimu- 
lus situation differed from the first; SG did 
not occur. 

This way of thinking of the problems of these children has
certain advantages. For one thing, we can look at the teaching
materials for these children in a more differentiated manner.
If we want the child to respond with the same response to two
different stimulus situations (grasping an “abstract”
concept), we should eliminate all unessential differences in
the stimuli, since these will hamper generalization. In
addition, we have an experimental literature in SG (recently
reviewed by Mednick & Freedman, 1960), on which we can draw
for suggestions on ways to augment SG responsiveness. Thus, it
has been shown that greater SG responsiveness is manifested
under higher drive levels (Brown, 1942; Mednick, 1957;
Rosenbaum, 1953). In addition, within limits, greater training
in giving a response to a stimulus will result in augmented SG
responsiveness to similar stimuli (Margolius, 1955; Thompson,
1959).


SUMMARY


The hypothesis that brain damaged chil- 
dren suffer reduced SG responsiveness was 
tested and supported. SG was measured along 
a visual-spatial dimension with an apparatus 
that required a voluntary response. Some ob- 
servations were made regarding the implica- 
tions of this finding for the training of the 
brain damaged child.


If one conceptualizes anxiety as an emotional arousal state
that varies from situation to situation and even in the same
situation from time to time, it becomes important to take this
variability into account in the construction of any paper and
pencil test designed to measure the construct. Ordinarily the
instructions accompanying most paper and pencil tests of
personality traits are vague with respect to the time interval
for which the person is rating himself, although the usual
implication is that the person is answering the items in terms
of how he has generally been during his life.

The first aim of the present research was to administer the
same anxiety scale to separate groups with instructions given
in terms of three different time intervals, and observe how
the correlations of the scale with a subsequent criterion
assessment of anxiety in a specific stress situation were
affected. The time intervals selected were “last 2 weeks,”
“last 6 months,” and “in general.” The criterion assessment
was made in individual sessions about a month later. It was
thought that the “last 2 weeks” and “in general” groups would
show less correlation with the criterion measure than the
“last 6 months” group. “Last 2 weeks” should be low because
this time interval would be most affected by transient
fluctuations, and “in general” should be low because it covers
too long a time interval, whereas “last 6 months” might more
likely tap the more recent characteristic level of anxiety.

The second aim of the study was to vary the degree to which
variance attributable to the tendency to respond in terms of
the social desirability of the item was removed from the
scale. An attempt to remove this source of variance was made
by applying different scoring procedures to a forced-choice
format of item triplets similar to that used by Heineman
(1953). One scoring procedure was thought to minimize social
desirability variance, a second to be heavily affected by it,
and a third was devised to measure more directly the social
desirability variance itself. It was expected that the
procedure that minimized social desirability variance would
yield scores most highly correlated with the criterion.

It was also expected that social desirability variance,
itself, would be differentially affected by the three
instructional time intervals. First, it is reasonable to
suppose that a person would be more willing to admit to
socially undesirable attributes if he were saying that these
were true for a 2-week period than for 6 months or in general,
and accordingly the mean social desirability scores should
decrease with increasing time intervals. Secondly, if one of
the scoring procedures does indeed successfully remove part of
the social desirability variance, then the mean scores for
that procedure should vary less as a function of instructional
time interval than the scores for the procedure that includes
more of this variance.

Previous research is meager with respect to showing
differential validity as a function of the degree to which
social desirability variance is removed from paper and pencil
anxiety measures. Edwards (1957) has pointed up the
pervasiveness of this variance and developed a scale to
measure it. Heineman (1953) attempted to rid the Taylor MA
scale of this variance by constructing a forced-choice
version, and showed that the correlation with the MMPI K scale
could be reduced. Silverman (1957) found that Heineman’s
forced-choice form correlated .24 (p = .05) with base level
palmar conductance obtained before a stress session, whereas
the regular Taylor MA scale correlated only -.02. Martin
(1959) reported a correlation of .44 between base level palmar
conductance during stress and a forced-choice scale composed
of adjective triplets taken immediately after the stress
session and a correlation of -.02 between the same measure and
the regular Taylor MA scale taken earlier in a group session.
The correlation of .44 was obtained in a group that took the
adjective triplets scale with instructions to answer in terms
of how they had just been feeling during the stress session.
Two other groups that were told to answer in terms of how they
had been feeling during the last month and in general,
respectively, showed no significant correlations with palmar
conductance.


PROCEDURE 


Construction of the Forced-Choice Scale 


The 20 items from the Taylor MA scale which, in independent
item analyses by Buss (1955) and Hoyt and Magoon (1954), had
been shown to discriminate between criterion groups, were used
as the anxiety items in the present scale. These were the same
20 items used by Bendig (1956) in the development of a short
form of the Taylor MA scale. Twenty-eight other items were
selected from the MMPI on the basis of a priori judgment as to
their not being directly related to anxiety and their
involving personality characteristics that were subject to
some fluctuation. The wording of the items was changed, where
necessary, so that all items were stated in the past perfect
tense. This was done to make it appropriate to answer the
items in terms of a specific past time interval.

All 48 items were then rated for social desirability by 110
students from an introductory psychology class on seven-point
rating scales. Forty triplets of items were then composed
following the format of Heineman (1953) in which an anxiety
item was paired with a nonanxiety item of equal social
desirability, and a third nonanxiety item was added which
differed by approximately two scale units in social
desirability (either plus or minus) from the first two items.
Each anxiety item appeared twice in the 40 triplets.


The Scoring Procedures

In taking this inventory, subjects were asked to select the
item in each triplet that was most like them and the item that
was least like them. Scoring Procedure A was rather
complicated and represented an attempt to remove social
desirability variance. In brief, the scheme was as follows:

Anxiety item most, nonanxiety matched item least: 3 points
Anxiety item most, nonanxiety nonmatched item least: 2 points
Nonanxiety nonmatched item most, anxiety item not marked: 2 points
Nonanxiety nonmatched item most, anxiety item least: 1 point
Nonanxiety matched item most, anxiety item not marked: 1 point
Nonanxiety matched item most, anxiety item least: 0 points

The logic behind this approach is perhaps best 
illustrated by examining the 3-point and 0-point com- 
binations. In the former it can be seen that the sub- 
ject is saying that the matched nonanxiety item is 
least like him. If putting himself in a favorable or 
unfavorable light had been important, it is more 
likely that he would have placed the nonanxiety non- 
matched item in either the most or least categories, 
rather than leaving it in the middle. In the 0-point
combination we have the situation in which answer-
ing the anxiety item as least like overrides considera-
tion of social desirability since the matched non-
anxiety item is marked most like.

Scoring Procedure B consisted simply of giving 2 
points if the anxiety item was marked most like, 1 
point if left unmarked, and 0 points if marked least 
like. This score should be influenced considerably by 
social desirability variance. 

The third scoring procedure represented an at- 
tempt to measure the social desirability variance it- 
self, although as a result of the nature of the triplet 
construction, there must inevitably be some negative 
correlation with the anxiety dimension. The variable 
was scored as follows, with a high score representing 
the tendency to say unfavorable things about one- 
self: 2 points if nonanxiety, nonmatched item was 
marked least like; 1 point if left unmarked; and 0 
points if marked most like. 
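The three scoring procedures can be summarized as a small lookup, offered here only as a sketch of the rules described above. The labels and function names are hypothetical. One reading is assumed: the 3-point combination is taken to be "anxiety item most, matched nonanxiety item least," as the explanation of the logic indicates, since the printed scheme is ambiguous on that line.

```python
ANXIETY, MATCHED, NONMATCHED = "anxiety", "matched", "nonmatched"

# Procedure A: points for each (most like, least like) combination in a triplet.
# The 3-point entry follows the paper's explanation of the logic (an assumption
# of this sketch where the printed scheme is unclear).
PROCEDURE_A = {
    (ANXIETY, MATCHED): 3,
    (ANXIETY, NONMATCHED): 2,
    (NONMATCHED, MATCHED): 2,   # anxiety item not marked
    (NONMATCHED, ANXIETY): 1,
    (MATCHED, NONMATCHED): 1,   # anxiety item not marked
    (MATCHED, ANXIETY): 0,
}


def score_a(most, least):
    return PROCEDURE_A[(most, least)]


def score_b(most, least):
    """2 if the anxiety item is marked most like, 1 if unmarked, 0 if least like."""
    if most == ANXIETY:
        return 2
    return 0 if least == ANXIETY else 1


def score_social_desirability(most, least):
    """2 if the nonmatched item is marked least like, 1 if unmarked, 0 if most like."""
    if least == NONMATCHED:
        return 2
    return 0 if most == NONMATCHED else 1


def score_inventory(responses):
    """Sum the three scores over a list of (most, least) choices, one per triplet."""
    return {
        "A": sum(score_a(m, l) for m, l in responses),
        "B": sum(score_b(m, l) for m, l in responses),
        "SD": sum(score_social_desirability(m, l) for m, l in responses),
    }


# Two hypothetical triplet responses.
example = score_inventory([(ANXIETY, MATCHED), (MATCHED, ANXIETY)])
print(example)
```

Because all three scores are functions of the same two choices per triplet, they are not independent of one another, which is the point the text makes about the inevitable correlation between the social desirability score and the anxiety dimension.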


Subjects and Group Testing 


Small groups of volunteer subjects from an intro- 
ductory psychology course were seen until a total of 
40 male and 40 female subjects in each of the three 
instructional conditions had been administered the 
Forced-Choice Anxiety scale. The three instructional 
conditions were obtained by asking the subjects to 
answer the scale in terms of how they had been dur- 
ing (a) the last 2 weeks, (b) the last 6 months, or 
(c) in general. 


The Individual Stress Session 


TABLE 1

CORRELATIONS AMONG THE THREE
CRITERIA MEASURES

                         Initial conductance          Rating
Measure                  Male       Female        Male     Female
Systolic change          .31        [illeg.]      .28      .30
Initial conductance                               .16      .11

Note.—r of .25 is significant at the .05 level for N = 60.


A random sample of 40 subjects (20 male, 20 female) was
selected from each of the larger group tested samples, and
contacted for the individual session, which occurred on the
average about a month after the group session. A more complete
description of the stress procedure may be obtained in Bergs
(1960). Briefly, the subject was confronted in close proximity
by two experimenters in a small room. The experimenters
watched the subject closely throughout the session and rather
obviously made notes and ratings. The subject was told at the
beginning,


In this experiment we are going to ask you to do 
several things. First, we will ask you to tell us 
what you see on a Rorschach test ink blot. For 
the second part we will ask you to tell us what- 
ever comes to your mind. We believe that your 
telling us everything that comes to your mind and 
your responses to this Rorschach card, together 
with this apparatus [points to GSR apparatus], 
will help us understand what your hidden feelings 
and emotions are, and tell us something about the 
kind of person you are. But first we want you to 


sit silently for another couple of minutes before 
we get started. 


Following this anticipation period, the experimenter turned on
a tape recorder and presented Rorschach Card II. After the
subject had responded, the experimenter commented, “Those
responses are not as well integrated as they might be.”

The subject was then asked to freely associate for a couple of
minutes, during which he was again mildly criticized for not
“really” saying everything that came to mind.

Continuous recording of palmar skin conductance and measures
of blood pressure at predetermined intervals were obtained
during the session. At the end of each session the two
experimenters rated the subject on a seven-point scale in
terms of how manifestly anxious the subject appeared to be
during the session. The correlations between the independent
ratings of the two experimenters were .62 for the female
subjects and .73 for the male subjects. On the basis of the
intercorrelations among the stress measures, a criterion index
of anxiety was composed based on the sum of the standard
scores for (a) base level conductance, (b) change in systolic
blood pressure, and (c) the average of the two experimenters’
anxiety ratings. Intercorrelations among the three measures
composing the index are shown in Table 1. There were no
significant differences between the means of these scores for
males and females.
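The construction of the criterion index, a sum of standard scores over three measures, can be sketched as follows. The data are hypothetical, the function names are introduced for illustration, and the sketch assumes standardization over the whole group with the sample standard deviation; the paper does not say whether scores were standardized within sex or instruction group.

```python
from statistics import mean, stdev


def standard_scores(values):
    """Convert raw values to z scores using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def criterion_index(conductance, systolic_change, rating_1, rating_2):
    """Sum of z scores for conductance, systolic change, and the mean of two ratings."""
    mean_rating = [(a + b) / 2 for a, b in zip(rating_1, rating_2)]
    components = [
        standard_scores(conductance),
        standard_scores(systolic_change),
        standard_scores(mean_rating),
    ]
    return [sum(parts) for parts in zip(*components)]


# Hypothetical values for four subjects; NOT data from the study.
index = criterion_index(
    conductance=[12.0, 18.5, 9.0, 15.5],
    systolic_change=[4, 10, 2, 8],
    rating_1=[3, 6, 2, 5],
    rating_2=[4, 5, 2, 5],
)
print([round(v, 2) for v in index])
```

Standardizing first gives each of the three measures equal weight in the sum, whatever its original units.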


RESULTS AND DISCUSSION 


Correlations between the criterion index of anxiety and the
Anxiety scale scores for the various instruction groups and
scoring procedures are shown in Table 2. The correlations are
presented separately for males and females, and it is apparent
that the male subjects yielded no significant correlations
under any condition. The negative correlational results for
the male subjects suggest caution in interpreting the other
findings; however, as will be seen, the female subjects yield
results highly consistent with the theoretical expectations.

For the female subjects the highest correlation, .62, is for
the “6-month” instruction group for Scoring Procedure A, the
one designed to reduce social desirability variance. The
correlation for Scoring Procedure A under “in general”
instructions, .49, is also significantly different from zero
(p < .05) but not significantly different from the correlation
obtained for the “6-month” instruction group. The correlation
of .10 for the “2-week” instruction group is significantly
less (p < .05) than the correlation obtained for the “6-month”
group. Thus, as expected, for the female subjects, the
“2-week” instruction does have the lowest predictive validity,
perhaps because it reflects unduly the more transient states
of anxiety. And although the difference between the
“6-months” and “in general” correlations is in the expected
direction, the difference failed to reach significance.


TABLE 2

CORRELATIONS BETWEEN THE ANXIETY INDEX AND THE THREE SCORING
PROCEDURES IN THE DIFFERENT INSTRUCTION GROUPS

                               Instructional time interval
                        In general        6 months          2 weeks
Scoring procedure      Male   Female    Male     Female   Male     Female
A                      -.15   .49       [illeg.] .62      [illeg.] .10
B                      [entries illegible]
Social desirability    [entries illegible]

Note.—r of .44 is significant at the .05 level for N = 20.


Scoring Procedure B, where no attempt was made to reduce
social desirability variance, did not yield any significant
correlations. By employing the somewhat dubious procedure for
testing for the significance of the difference between
correlations based on the same subjects (McNemar, 1949,
p. 124), it was found that the correlations for Scoring
Procedure A were significantly higher (p < .05) than the
correlations obtained with Scoring Procedure B for both the
“in general” and “6-months” groups.

None of the correlations for the social desirability scoring
procedure was significant, although the correlations were of
substantial magnitude for both the “in general” and “2-weeks”
groups.


TABLE 3

MEANS AND SIGMAS OF THE THREE SCORING PROCEDURES FOR MALES AND
FEMALES COMBINED

                                  Scoring procedure
                         A                B          Social desirability a
Instruction group    Mean    Sigma    Mean    Sigma    Mean      Sigma
In general           69.88   10.06    37.96    6.16    73.238     6.50
6 months             69.85   14.21    39.40   11.41    75.33     13.00
2 weeks              72.52    9.93    41.78    8.37    76.32      7.47

a A high score represents a tendency to admit to socially
undesirable characteristics.


The means and sigmas for the different instruction groups and
different scoring procedures are shown in Table 3. There were
no significant differences between male and female subjects
for any of these means and, accordingly, the two sex groups
were combined to yield an N of 80 for each instruction group.
It can be seen that there is a general tendency for the means
to increase as the instructional time interval decreases. For
Scoring Procedure A this tendency is not significant as tested
by analysis of variance. For both Scoring Procedure B and the
social desirability scoring procedure, there is a significant
effect (p < .05) of instructional time interval upon these
mean scores. These results are consistent with the theoretical
expectation that subjects would admit to more unfavorable
characteristics as the time interval decreases. However, one
cannot conclude that Scoring Procedure B manifests this effect
more than Scoring Procedure A. A correlated-measures analysis
of variance was performed on the A and B scoring procedures,
and the interaction of scoring procedure by instructional time
interval was not found to be significant.

In conclusion the results of the present re- 
search indicate that both the instructional 
time interval and social desirability variance 
affect the predictive validity of a paper and 
pencil test of anxiety. It was also found that 
subjects are likely to say more unfavorable 
things about themselves when the time inter- 
val being reported on is short. 

It was not the purpose of this paper to 
publish a new psychometric test and, in fact, 
item analyses of the present scale (not re- 
ported in this paper) suggest that many of 
the item triplets are not predictive at all. The 
completely negative results for the male sub- 
jects emphasize this point. 


SUMMARY 


The primary purpose of this research was to study the effect of
instructional time interval and social desirability variance upon
the predictive validity of a forced-choice anxiety scale. Subjects in
three different groups were asked to answer the scale in terms of how
they had been during the last 2 weeks, last 6 months, and in general.
The forced-choice triads were then scored by Procedure A, which was
designed to reduce variance associated with the social desirability of
the items, and by Procedure B, which was presumed to be heavily
affected by the social desirability of the items. A criterion index of
anxiety was obtained in an individual stress session, and was based on
skin conductance level, change in systolic blood pressure, and a
rating of anxiety.

The results were entirely negative for the male subjects; no
significant correlations were found for any instruction group or for
either scoring procedure. For the female subjects the results were in
accord with the theoretical expectations. The highest predictive
validity was obtained for the “6-month” instruction group for the
scoring procedure that was designed to minimize social desirability
variance. The correlation with the criterion was also significant for
the “in general” instruction group for Scoring Procedure A. No
criterion correlations were significant for Scoring Procedure B.

A second aim of the research was to study the effects of instructional
time interval upon the mean social desirability scores, which were
assessed by a third scoring procedure. As expected, subjects tended to
admit to more socially undesirable characteristics as the
instructional time interval decreased.

RESPONSE STYLES IN CLINICAL
AND NONCLINICAL GROUPS


H. J. WAHLER
University Health Center, Ohio State University


Over the past decades, conceptions of personality measurement have
undergone various transitions. One change is reflected in an increased
interest in molar response characteristics conceived as response sets
(Cronbach, 1946), or response styles (Jackson & Messick, 1958),
elicited by the verbal stimuli of questionnaires. Two response sets
have been of particular interest currently. One, Edwards (1953)
describes as the “social desirability variable.” He and other
investigators found high correlations in the vicinity of .87 between
rated item social desirability and averaged scores of student groups
on self-descriptive questionnaire items. The other is a response set
which Cronbach (1946, 1950) termed acquiescence or a tendency to agree
(or disagree) with items irrespective of their content. In their
recent review, Jackson and Messick (1958) conclude that:


In the light of accumulating evidence, it seems likely
that the major common factors in personality inventories
of the true-false or agree-disagree type . . .
are interpretable primarily in terms of style rather
than specific item content (p. 247).


If the major common factors in personality inventories are
interpretable in terms of response (R) styles, then two groups which
differ significantly on a scale of psychiatric symptomatology should
also differ significantly in terms of R styles shown by the members. A
sample of patients that cannot be so discriminated from controls
should not show different proportions of R styles than controls. Also
subgroups of subjects from different clinical and nonclinical groups
who exhibit the same R styles on one questionnaire should have
comparable scores on different scales. Furthermore, if R styles may be
interpreted as the major common factor rather than specific item
content, scores obtained by subjects with one set of personality items
should covary with their scores derived from different scales with
different content, the direction being consistent with the bias of
their R style.

Messick and Jackson also point out that R 
set studies have tended to focus on one or 
another R style such as acquiescence or social 
desirability without studying both conjointly. 
One point which particularly bears special at- 
tention is the possibility that the set to agree 
or disagree may interact with item desir- 
ability. 

The purpose of this study is to investigate 
the above propositions which may be briefly 
restated as four questions: (a) Are significant 
differences found between clinical and non- 
clinical groups in terms of the frequency with 
which different R styles occur when the clini- 
cal group can be differentiated from controls 
by a scale of general psychiatric symptoma- 
tology, and when clinical and control groups 
cannot be so discriminated? (b) Do subjects 
who show the same R styles in self-ratings ob- 
tain comparable scores with true-false scales 
in spite of their being members of different 
clinical and nonclinical groups? (c) Do sub- 
jects exhibit the same R sets with different 
items and modes of responding? For example, 
if they tend to deny traits on a self-rating 
scale do they show the same tendency with 
another set of items and a true-false mode of 
responding? (d) Is the tendency to claim or 
deny undesirable traits related to claiming or 
denying desirable traits or are these tenden- 
cies independent? Do the clinical and control 
samples differ in this respect? 


PROCEDURE 
Response Styles 


Couch and Keniston (1960) have shown a significant correspondence
between average level of response to items rated on a seven-point
scale and the number of “true” responses to MMPI items. This suggests
that the set to agree or disagree may [...] of responding. If this
[...], four different R styles may be readily defined based on the
level of individual subjects’ responses to a self-rating inventory.
These include or reflect the three response sets of “major current
interest,” namely, the set to agree, to disagree, and to [...] that
correlate positively with perceived social values associated with
items, and a fourth style which is the opposite of the latter. With a
self-rating inventory containing items judged desirable (D) and other
items judged undesirable (U), a two-way classification of scores in
terms of level (high-low mean ratings) and item desirability (D-U) can
serve to define the four R styles. A low score on either D or U scales
was defined as one below the median of the distribution of scores for
all subjects combined. A high score was defined as lying above the
median of combined distributions. Subjects who rate low on both D and
U scales are showing a tendency to deny or disagree irrespective of
content and perceived social values of items. Subjects rating high on
both D and U scales are exhibiting an R style indicative of an
agreeing set. Subjects who rate high on D and low on U are, by
definition, manifesting a social desirability R style. The fourth
style evolves logically from the two-way classification of scores.
Subjects rating low on D and high on U scales would fall in this
class. Logically this R style can be described as the antithesis of a
social desirability set, a social undesirability set.
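The two-way classification described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function and variable names are hypothetical, the example scores are invented, and a score falling exactly on the median (a case the text does not address) is treated here as low.

```python
from statistics import median

# Style labels keyed by (high on D?, high on U?), following the text.
R_STYLES = {
    (False, False): "denying",
    (False, True): "socially undesirable",
    (True, False): "socially desirable",
    (True, True): "agreeing",
}


def classify_r_styles(scores):
    """Assign each subject an R style from mean D and U self-ratings.

    scores maps a subject label to a (D score, U score) pair. High and
    low are defined relative to the medians of all subjects combined.
    Assumption: a score equal to the median counts as low.
    """
    d_median = median(d for d, _ in scores.values())
    u_median = median(u for _, u in scores.values())
    return {
        subject: R_STYLES[(d > d_median, u > u_median)]
        for subject, (d, u) in scores.items()
    }


# Invented example ratings on the nine-point scale.
example = {"s1": (2.0, 2.0), "s2": (2.0, 8.0), "s3": (8.0, 2.0), "s4": (8.0, 8.0)}
styles = classify_r_styles(example)
print(styles)
```

Because the medians are taken over all subjects combined, the same cutting points apply to patients and controls alike, which is what allows the frequencies of the four styles to be compared across groups.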


Self-Description Inventory 


The self-description inventory (SDI) contains 44 items with
heterogeneous content pertaining to characteristics of common clinical
interest such as anxiety, hostility, sexual adjustment, interpersonal
adjustment, dependency, etc. This content is phrased in the first
person with nontechnical language, i.e., “I have trouble getting along
with people.” Subjects are instructed to rate themselves on these
items by means of a nine-point scale with anchoring statements ranging
from “not at all like me” to “beyond question very much like me.”

Twenty-seven of the items which had been independently rated as
slightly to highly undesirable (mean ratings of less than five with a
nine-point scale of D-U) were selected in conjunction with a different
study, Wahler (1958), on the basis of their ability to discriminate
mental hygiene intake and nonpatient groups at the .05 or less level
of significance. These 27 items comprise the U traits scale used in
classifying R styles. Eight items which were judged slightly to highly
desirable (mean ratings of greater than 5) make up the D traits scale.
The U score for each subject is the mean of his self-ratings on the U
items and the D score is the mean of his self-ratings on the D items.


MMPI Scales


Three MMPI scales were selected as measures in this study since they
contain a variety of content and require a true-false mode of
responding which differs from the self-rating approach with the SDI.
Norma Besch and James Taylor kindly made data collected by them
available to the author. Besch had 71 male undergraduates at Ohio
State University rate 200 MMPI items for “personal desirability” on a
nine-point scale; the Pt and L scales were included among the items.
Taylor obtained social desirability ratings from 81 adult normals on
205 MMPI items which included the K scale. From these ratings it was
evident that items making up the K, L, and Pt scales are perceived as
primarily undesirable. Seventy-three percent of the K items were
judged undesirable. The 22 undesirable K items had a mean median
rating of 3.79 on a nine-point scale. Eighty-one percent of the 48 Pt
items were judged undesirable with a mean mean rating of 3.28 for the
39 undesirable items. Eighty percent of the L items were judged U with
a mean mean rating of 4.13. Scores from each of these scales accrue
mainly from responses in one direction. The K score increases with the
number of items denied except for 8 out of 30. The magnitude of the L
score is also a function of denying items in all cases but 3 out of
15. Pt scores, on the other hand, are mainly a function of the number
of items claimed or agreed with; this is true for 39 out of 48 items.
Furthermore, the item overlap is negligible among these three scales
with only one common item (J-51) scored in the same direction on the K
and L scales.


Subjects 

The nonpsychiatric subjects consisted of 26 male and 44 female
sophomores taking an introductory psychology course and 39 male
students taking an elementary personality course at the State
University of Iowa. The SDI and a “Biographical Inventory” containing
240 MMPI items were administered to these people in groups. These
people were told that their responses were to be studied as part of a
research project and would not be used in any personal way.

Two different subpopulations of clinical subjects were sampled on the
assumption that outpatients who voluntarily seek help at a mental
hygiene clinic are more likely to describe their characteristics
frankly (as they conceive them) than are hospitalized patients who as
a group often tend to exhibit more severe pathology manifested by
extreme reality distortion and resistive defenses exhibited in marked
denial or negativism and who frequently have been pressured by
relatives and/or community to consign themselves to conditions which
they do not like and/or believe they don’t need and hope to leave by
appearing “normal.”

The Mental Hygiene Clinic (MHC) subjects (assumed more frank)
consisted of 47 males in the process of applying for help with
personal problems at a Veterans Administration Mental Hygiene Clinic.
Their mean age was 32.6 with a range of 22 to 48. All tests were
administered individually in the course of [...].


[...] of the SDI was given this group; D score is based on 5 of the 8
desirable items.


TABLE 1


MEAN SCORES OF GROUPS ON MMPI SCALES AND SELF-RATINGS


                                       SDI                 MMPI
Subjects                       N     D      U       K       Pt      L
Student groups combined (A)   109   5.15   [?]    15.99   13.06    2.85
MHC (B)                        47   5.61   4.94   12.55   21.36    3.47
VAH a (C)                      75   5.47   3.23   19.67   13.92    6.08

Pairs of scores
compared**         t     p       t     p       t     p       t     p       t     p
A:B               2.5   .025    8.4   .001    4.5   .001    5.7   .001    ns
A:C               2.1   .05     ns            3.8   .001    ns            6.6   .001
B:C               ns            7.4   .001    6.5   .001    3.6   .001    4.8   .001


a All MMPI scores for VAH group based on N of 24.


** F tests over all column means were all significant at the .025 or less level.


The hospitalized (VAH) group (assumed more
guarded) consisted of 75 patients, 65% of whom 
were diagnosed as some type of schizophrenic; the 
remaining were diagnosed as other types of func- 
tional psychiatric disorders. Their mean age was 34.4 
with all but two cases within the range of 18 to 49. 
The two exceptions were 60 and 61 years of age. All 
subjects were administered the SDI individually dur- 
ing admission testing. Tests other than the SDI were 
administered at the discretion of the examiners; only 
24 were given the card form of the MMPI. 


RESULTS 
The assumption that patients seeking help at an outpatient clinic
respond more frankly (less defensively) than hospitalized patients as
a group was well substantiated by the relative levels of mean Pt, K,
and L scores of these groups in comparison with each other and
controls as shown in Table 1. MHC subjects had significantly higher Pt
scores than controls or hospitalized patients while the mean Pt of VAH
subjects did not differ significantly from controls. To the extent
that the Pt reflects psychiatric symptomatology, the hospitalized
patients claimed no more symptoms than did students. Thus, the VAH
group qualifies as a clinical sample that is not differentiated from
controls by a scale reflecting psychiatric pathology while the MHC
group was so differentiated. Also, the VAH patients scored
significantly higher than MHC or student subjects on K and L indices
of defensiveness.


TABLE 2


CONTINGENCY ANALYSIS OF FREQUENCY OF FOUR R STYLES SHOWN BY GROUPS


                                       R styles
                               Socially        Socially
               Denying         undesirable     desirable       Agreeing
Group          Low D-Low U     Low D-High U    High D-Low U    High D-High U
VAH            21 (28)          9 (12)         24 (32)         21 (28)
MHC             3 (6)          19 (40)          1 (2)          24 (51)
Student (F)    18 (42)          8 (18)         15 (34)          3 (7)
Student (M)    18 (28)         15 (23)         16 (25)         16 (25)


Over-all χ² = 51.4, p < .001


Group comparisons
VAH:MHC, χ² = 33.8, p < .001
VAH:F, χ² = 8.4, .02 < p < .05
VAH:M, χ² = 3.3, p > .30


Note.—Values in parentheses are percentages.


TABLE 3


MEAN Pt SCORES OF GROUPS CLASSED BY R STYLES


                                    R styles
                                Socially      Socially
                 Denying        undesirable   desirable     Agreeing
Group            M      N       M      N      M      N      M      N      F      df     p
VAH             12.2    8       -      1      [?]    9     20.7    6      [?]    2/20   ns
MHC              6.0    3      20.8   19      -      1     24.0   24      5.4    2/43   .01
Student (F)     11.1   18      15.5    8      8.7   15     [?]     3      4.8    3/40   [?]
Student (M)      9.7   18      16.0   15     11.0   16     20.1   16     11.1    3/61   .001
F                0.8            2.0           0.8           0.8
df               3/43           2/39          2/37          3/45
p                ns             ns            ns            ns



The most striking difference in frequencies 
of R styles is exhibited by the MHC group; 
as shown in Table 2, 91% of the MHC sub- 
jects had socially undesirable or agreeing 
styles. The over-all differences in frequencies
within Table 2 are highly significant (p
< .001). Comparisons of groups in pairs show
that the MHC group differs significantly from
all others. Frequencies of R styles shown by
the VAH group did not differ significantly
from those of control males but did differ
significantly from those of females (.02 < p
< .05).
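The contingency analysis reported above can be retraced from the Table 2 frequencies. The sketch below assumes an ordinary Pearson chi-square without continuity correction, which the report does not state explicitly; the function and variable names are illustrative only.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a frequency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Frequencies from Table 2, in the order: denying, socially undesirable,
# socially desirable, agreeing.
frequencies = {
    "VAH": [21, 9, 24, 21],
    "MHC": [3, 19, 1, 24],
    "Student (F)": [18, 8, 15, 3],
    "Student (M)": [18, 15, 16, 16],
}

overall, overall_df = chi_square(list(frequencies.values()))
vah_mhc, vah_mhc_df = chi_square([frequencies["VAH"], frequencies["MHC"]])

# Share of MHC subjects with socially undesirable or agreeing styles.
mhc_share = 100 * (19 + 24) / sum(frequencies["MHC"])

print(round(overall, 1), overall_df)
print(round(vah_mhc, 1), vah_mhc_df)
print(round(mhc_share))
```

Under this assumption the over-all value and the VAH:MHC comparison agree with the figures of 51.4 and 33.8 given in Table 2, and the MHC share agrees with the 91% cited in the text.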

When these frequencies are grouped in
terms of high or low scores on U, disregarding
D, the group differences remain significant.
When the frequencies are classed in
terms of high or low scores on D, disregarding
U, differences between groups are no
longer significant.

Subjects from the different diagnostic groups were formed into
subgroups based on the R styles they showed in responding to the SDI.
Mean Pt, K, and L scores were computed for each of these subgroups. In
Table 3 it may be seen that the mean Pt scores within any column are
quite comparable for members of different diagnostic groups with the
same R styles. None of the F tests for significance of differences
within columns attained significance at the .05 level.

When row means are compared (within diagnostic group across R styles)
it is evident that within each of the four diagnostic groups, subjects
who had socially undesirable or agreeing R styles obtained higher mean
Pt scores than did those with denying or social desirability styles.
Differences between these means for groups with significant F tests
across R styles were significant at the .05 or less level, as
determined by t tests of pairs of means, with one exception where the
Pt mean in the socially undesirable category was not significantly
different from that in the denying category for female students. None
of the mean Pt scores of subjects exhibiting denying or social
desirability R styles differed significantly; this was also true for
subjects showing socially undesirable or agreeing styles.


TABLE 4


MEAN K SCORES OF GROUPS CLASSED BY R STYLES


                                    R styles
                                Socially      Socially
                 Denying        undesirable   desirable     Agreeing
Group            M      N       M      N      M      N      M      N      F      df     p
VAH             19.8    8       -      1     21.4    9     16.7    6      1.54   2/20   ns
MHC             17.3    3      13.0   19      -      1     11.8   24      2.8    2/43   ns
Student (F)     18.3   18      [?]     8     [?]    15     [?]     3      4.1    3/40   [?]
Student (M)     18.6   18      13.5   15     16.9   16     12.8   16      7.4    3/61   .001
F                [?]            [?]           7.0           2.4
df               3/43           2/39          2/37          3/45
p                ns             ns            .01           ns


TABLE 5


MEAN L SCORES OF GROUPS CLASSED BY R STYLES


                                    R styles
                                Socially      Socially
                 Denying        undesirable   desirable     Agreeing
Group            M      N       M      N      M      N      M      N      F      df     p
VAH              5.6    8       -      1      6.4    9      5.8    6      0.1    2/20   ns
MHC              6.3    3       3.3   19      -      1      3.2   24      3.6    2/43   .05
Student (F)      3.2   18       3.2    8      3.5   15      3.0    3      0.1    3/40   ns
Student (M)      2.8   18       [?]   15      2.4   16      2.4   16      0.2    3/61   ns
F                6.0            0.5           8.7           2.8
df               3/43           2/39          2/37          3/45
p                .01            ns            .001          ns


The same analyses were computed for K scores. In this case, there were
no significant differences within R styles (columns) except in the
social desirability category. Here, the VAH group obtained
significantly higher K scores than did either of the student groups.
In Table 4 it may be seen that subjects who had socially undesirable
or agreeing R styles irrespective of diagnostic group membership
obtained lower mean K scores than did subjects in the social
desirability or denying categories. Again, t tests of differences
between combinations of K means taken two at a time for the two
student groups (groups with significant Fs across R style categories)
yielded significant differences at the .05 or less level between K
means obtained from subjects with denying or social desirability
styles and those with socially undesirable or agreeing sets.

Comparable analyses of the mean L scores 
showed different trends than were found with 
K or Pt. It may be seen in Table 5 that, in 
general, VAH patients have the highest L 
scores irrespective of R styles except for the 
small number of MHC cases who exhibited 
the denying style. There were no significant 
differences between the two student groups 
either across sex classifications or across R 
style categories. The mean L scores of MHC 
cases in the socially undesirable and agreeing 
categories do not differ significantly from 
those of students. The VAH group had a sig- 
nificantly higher mean L than either student 
group in the social desirability and denying 
categories; the MHC L was also significantly 
higher than mean L scores of students in the 
latter category. 


TABLE 6 


CORRELATIONS AMONG MMPI AND SELF-RATING SCORES OF GROUPS


Scale
Group N K:L K:Pt K:U K:D Pt:L Pt:U Pt:D L:U L:D U:D
VAH 24 «50%  —.50* s 35 26 45 aie ae 
MHC 47 31*  —.57* —.28 30 2 4B* 13 04 
Student (F) 44 .31* —.63* OL —.37* S8* 3e —.22 19 ai" 
Student (M) 65 27%  —.75* 06 —24* .72* 24* -—16 —.18 Kr 


a N = 75 in this instance for VAH group. 
* Significant at .05 or less level of significance. 
The intercorrelations of K, L, Pt scores and mean U and D ratings are
presented in Table 6. Nineteen out of the 24 correlations between
scales composed mainly or entirely of undesirable items (K, L, Pt, and
U) are statistically significant at the .05 or less level and in all
instances except one ( :U) the coefficients are in the same direction
across groups. The magnitudes and directions of these coefficients
indicate that subjects tend to deny or claim socially undesirable
traits with some degree of consistency on different scales. The mean
ratings on desirable items were correlated with scores from the scales
comprised of undesirable items; only 5 out of the 16 coefficients were
significant and the directions of the coefficients across groups were
not consistent.


DISCUSSION


The assumption that a “frank” clinical group should exhibit
significantly different proportions of R styles relative to controls
was supported by the findings. It was also found that the various R
styles occurred with about the same frequency in a comparable control
group and a group of clinical cases that could not be discriminated by
a scale of psychiatric symptomatology. Both of these findings are in
accord with Jackson and Messick’s conclusion that the major common
factors in personality inventories are interpretable in terms of
response style.

From their conclusion it would also be predicted that subgroups of
subjects from different clinical and nonclinical groups who exhibit
the same R styles on one questionnaire should have comparable scores
on different scales. For two of the MMPI scales this was found to be
the case; with the L scale, hospitalized subjects scored consistently
higher than other subjects, irrespective of R styles. Eichman (1959),
in comparing female schizophrenic patients with controls, also found
that high L scores characterized his samples of hospitalized patients.

It was found that some scales with different content reflect the bias
of the respondee’s R style in a consistent direction, but not others.

The correlational analyses provide evidence that members of all groups
tended to respond with some degree of consistency to scales composed
of socially undesirable items. Correlations of the various scales with
mean self-ratings on D items failed to show any consistent trends and
the majority of the coefficients were not significant. These findings
are in opposition to an assumption that a general acquiescence set may
operate independently of the judged desirability or undesirability of
items. In general, the above findings constitute or suggest
interactions.

It was not possible in this study to use a four factor analysis of
variance design because the limited number of clinical subjects made
it infeasible to discard cases in order to achieve the proportionality
of scores required for such analyses. However, four factors are
represented in the more fragmented series of analyses: (a) samples
from different subpopulations (this group’s factor is confounded with
situational factors); (b) different [...] and two categorical
variables used in classifying subjects; (c) high or low scores
(claiming or denying tendencies) on self-rating scales; and (d)
classifications of items as socially desirable or undesirable on the
basis of independent ratings.

The essential R style factor found in this study was what DeSoto and
Kuethe (1959) term the “symptom-claiming set”—a disposition on the
part of subjects to claim (or deny) undesirable symptomatic traits.
But this factor is hardly pure since it interacts with [...]—which may
in turn be a function of [...] different types of content. A Type I
design (Lindquist, 1953) was used to test the significance of the
scales by R styles interaction; this interaction was significant
(p < .001) using MMPI scores on K, L, and Pt for subjects classed as
denying undesirable traits on self-ratings and subjects tending to
claim undesirable traits. It was also possible with this material to
analyze the group by scales interaction, which likewise was
significant at less than the .001 level. Analyses of [...] styles
across groups showed an item desirability by groups interaction. While
the more elegant complete factorial analysis was not possible, the
evidence clearly points toward multiple interaction effects among the
variables considered.

If the general implications of these findings are borne out by
additional and more explicit evidence, Jackson and Messick’s statement
might need rephrasing. For example, it might have to be changed to
read “it seems likely that the major common factors in personality
inventories of the true-false or agree-disagree type . . . are
interpretable primarily in terms of . . .” R styles which in turn are
functions of interacting variables such as social desirability of
items, certain specific types of item content, differential
characteristics of the subpopulations sampled, and the circumstances
under which subjects are tested.


SUMMARY 


Several implications follow from the propo- 
sition that the major common factors in per- 
sonality inventories are interpretable mainly 
in terms of response styles. Among them are 
the following three hypotheses: (a) Groups 
which differ significantly in terms of scores 
on a scale of symptomatology should also 
differ in the frequency with which various R 
styles are shown by the members, and that 
groups which cannot be differentiated on the 
basis of such scores should not exhibit sig- 
nificantly different frequencies of R styles. 
(b) Subjects who exhibit the same R styles 
on one instrument should have similar scores 
on different scales even though they are members
of different subpopulations. (c) Subjects’
R sets should be manifested in a consistent 
direction on different scales. A question may 
also be raised as to whether R sets operate 
independently of or interactively with the 
social desirability values associated with items 
comprising scales. 

The findings show that a group of “frank” 
outpatient subjects exhibit a significantly dif- 
ferent frequency of R styles than controls and 
“guarded” patients. A group of “guarded” 
hospitalized patients who scored at the same 
level as controls on a scale of symptomatology 
could not be discriminated from controls by 
differential frequency of R styles. Subjects 
from different diagnostic groups classed ac- 
cording to the R styles they showed on self- 
ratings had similar mean Pt and K scores 




within each R style category but not similar 
L scores. Subjects’ tendencies to deny or claim 
undesirable characteristics were exhibited rela- 
tively consistently in terms of mean scores on 
self-ratings and the Pt and K scales. Different
diagnostic groups responded differently to the 
L scale irrespective of R styles. Scores on 
scales comprised of items describing undesir- 
able traits were found to covary in a consist- 
ent direction. No such consistency was found 
when these scores were correlated with a scale 
composed of desirable items. 

Consistent response characteristics inter- 
pretable as R sets or styles were found, but 
it was inferred from the findings that multiple 
interaction effects are likely among variables 
such as (a) the particular type of R style 
shown, (b) subpopulation differences, (c) de- 
sirability and undesirability of items com- 
prising scales, and (d) other scale differences 
such as types of content. 


REFERENCES 


Couch, A., & Keniston, K. Yeasayers and naysayers: Agreeing response
set as a personality variable. J. abnorm. soc. Psychol., 1960, 60,
151-174.

Cronbach, L. J. Response sets and test validity. Educ. psychol.
Measmt., 1946, 6, 616-623.

Cronbach, L. J. Further evidence on response sets and test designs.
Educ. psychol. Measmt., 1950, 10, 3-31.

DeSoto, C. B., & Kuethe, J. L. The set to claim undesirable symptoms
in personality inventories. J. consult. Psychol., 1959, 23, 496-500.

Edwards, A. L. The relationship between the judged desirability of a
trait and the probability that the trait will be endorsed. J. appl.
Psychol., 1953, 37, 90-93.

Eichman, W. J. Discrimination of female schizophrenics with configural
analysis of the MMPI profile. J. consult. Psychol., 1959, 23, 442-447.

Jackson, D. N., & Messick, S. Content and style in personality
assessment. Psychol. Bull., 1958, 55, 243-252.

Lindquist, E. F. Design and analysis of experiments in psychology and
education. Boston: Houghton Mifflin, 1953.

Wahler, H. J. Social desirability and self-ratings of intakes,
patients in treatment, and controls. J. consult. Psychol., 1958, 22,
357-363.


(Received October 12, 1960) 


Journal of Consulting Psychology
1961, Vol. 25, No. 6, 540-542


THE CLINICAL UTILITY OF “INVALID” 
MMPI F SCORES 


MALCOLM D. GYNTHER 
Washington University Medical School 


Many investigators (e.g., Astin, 1959; Gil- 
berstadt & Duker, 1960; Rempel, 1958) 
eliminate MMPI profiles containing high F 
scores from analyses on the grounds that such 
profiles are not valid. This procedure is con- 
sistent with early injunctions to omit such 
profiles from research work (Hathaway & 
Meehl, 1951), but inconsistent with more 
recent views or experimental findings which 
emphasize the characterological implications 
of scores on the validity scales (Gough, 1956b; 
Gross, 1959), rather than test taking attitudes 
as such. Determination of the relationship between
high F scores, diagnostic classification,
and aggressive versus passive criminal behavior
would seem to be helpful in demonstrating
whether such “invalid” F scores have any
predictive value.


METHOD 
Test data of all 353 white male cou 


(all cases test- 
e profiles given 
ause this num- 


analysis as a separate 
category. The remaining 246 cases were sorted into 
subsamples according to the diagnostic impression of 
the psychiatric staff, which was not based on the 
MMPI data. This procedure yielded 194 behavior 
disorders (BD), 29 neurotics (N), and 23 psychotics 
(P). Intelligence estimates in the form of Kent-EGY,
Scale D scores were available for all cases. Means
for the BD, N, and P groups were 28.16, 27.14, and
27.43, respectively. (The average range is 24-31, in-
clusive.) Statistical analyses revealed no significant
differences between the groups, which suggests that
whatever differences there are between distributions
of F scores cannot be attributed to differences in
intelligence. Mean ages for the BD, N, and P groups
were 30.31, 39.96, and 37.83 years, respectively. Sta-
tistical analyses showed that the BD group is sig-
nificantly younger (p < .01) than either of the other
groups.


Different investigators use different F values as a
basis for discarding data. Sometimes the reader is
only informed that cases were removed because the
validity scores were “high” (e.g., Rosen, 1958; Sop-
chak, 1958), but in most cases the exact cutting
score is given (e.g., Goodstein & Dahlstrom, 1 J
Panton, 1958). In this investigation, high was de-
fined as F > 16 raw score points, following the rec-
ommendation of Gough (1956a) and Meehl (1956).


RESULTS AND DISCUSSION


Table 1 shows the distribution of F scores
for the BD, N, and P groups. Mean F scores
were 8.66, 6.76, and 8.04, respectively. Sta-
tistical analyses revealed no significant differ-
ences, which implies that differences in the
total distribution of F scores cannot be at-
tributed to differences in diagnostic classifica-
tion. However, there were striking differences
between the groups in the proportion of F
scores greater than 16. Thirty-nine
scores fell into this invalid category, with 37
of these being given by individuals classified
as behavior disorders. Percentages of F > 16
scores for the BD, N, and P groups were
19.1, 0, and 8.7, respectively. Chi square, cor-
rected for continuity, showed that these dif-
ferences depart significantly from chance (χ²
= 6.04, df = 2, p < .05).
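
The reported chi square can be checked from the counts given in the text. The following Python sketch is only an illustration, not the author's procedure. It assumes 37, 0, and 2 scores above 16 in the BD, N, and P groups of 194, 29, and 23 men, and it assumes that the continuity correction was applied by subtracting 0.5 from every absolute deviation. Under those assumptions the corrected statistic comes out close to the reported 6.04. The function and variable names are illustrative.

```python
# Chi-square test of independence for a groups-by-category table,
# with an optional 0.5 continuity correction on every cell (an assumption).

def chi_square(table, correction=0.0):
    """table: list of rows, each row a list of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            # Shrink each absolute deviation by the correction, never below 0.
            deviation = max(abs(observed - expected) - correction, 0.0)
            statistic += deviation ** 2 / expected
    return statistic

# Rows: BD, N, P. Columns: F > 16, F <= 16.
counts = [[37, 194 - 37], [0, 29 - 0], [2, 23 - 2]]

uncorrected = chi_square(counts)
corrected = chi_square(counts, correction=0.5)
print(f"uncorrected = {uncorrected:.2f}, corrected = {corrected:.2f}")
```

The expected frequency for neurotics with F > 16 is below 5, so the statistic should be read with the usual caution for small expected counts.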

The significant age differences between the
groups raise the question of whether dif-
ferences in F > 16 distributions might be
explained by the age differences alone. Analy-
sis of the younger and older halves of the BD
group showed that the younger subsample
gave F > 16 scores more frequently than the
older men (χ² = 5.64, df = 1, p < .02). Also,
the two F > 16 scores found in the psychotic
group were given by 22-year-old men. Age is
obviously an important factor. However, if
the mean age of the BD group is increased
(by eliminating every other subject 30 years
old or younger) so that it is not significantly
different from the P group, the BD group still
had twice the percentage of F > 16 scores
of the P group (25/141 or 17.7% versus
8.7%).

It would appear that invalid MMPI F 
scores can discriminate between diagnostic 
classifications. That is, in groups of male 
court referrals matched for age and intelli- 
gence, behavior disorders obtained 67% of 
the F > 16 scores, psychotics 33%, and neu- 
rotics 0%. And, if one were to consider only 
individuals 23 years of age or older, 100% 
of the F > 16 scores would be obtained by 
behavior disorders. 

These court-referred individuals differ from 
psychiatric patients-in-general in that they 
all have a reason for “faking bad”: to de- 
crease the probability that they will be con- 
victed of the crimes with which they are 
charged. It would be worthwhile to replicate 
this study with routinely admitted psychiatric 
patients to see if our results are substantiated 
by a group with less reason for dissembling. 
One check for dissembling in these court-re- 
ferred subjects is to test the hypothesis that 
faking bad is positively related to the severity 
of the crime with which the person is charged. 


Analysis of the data given by murderers and
rapists (N = 31) does not support the hy-
pothesis, since the percentage of F > 16
scores given by this subgroup with the most 
serious charges against them is nearly equiva- 
lent to the percentage obtained by the re- 
mainder of the sample (16.1 versus 15.8). 
Furthermore, the dissembling interpretation 
does not account for the differential F > 16 
results found with the different diagnostic 
classes. 

With regard to the characterological inter- 
pretation of the F scale, it is interesting to 
note that Leary (1956) considers F to be a 
measure of the aggression and sadism to be 
expected in interpersonal relations. Thus, the 
higher the elevation on F, the more cruel and 
unkind the individual is predicted to behave. 
From that point of view, our subjects who 
obtained F > 16 scores would be considered 
as more aggressive in an antisocial manner 
than the remainder of the sample. Analysis of 
the F > 16 scores obtained by those who
committed aggressive crimes (i.e., stealing,


TABLE 1
FREQUENCY DISTRIBUTION OF RAW SCORES ON THE
MMPI F SCALE FOR BEHAVIOR DISORDER, NEUROTIC,
AND PSYCHOTIC GROUPS


Raw score     BD      N      P
               0      0      1
               4      0      0
               8      0      0
               2      0      0
              10      0      0
               7      0      1
               6      0      0
14-16          8      2      2
11-13         11      3      1
8-10          19      8      5
5-7           25      4      1
2-4           66     10     10
0, 1          28      2      2
N            194     29     23


rape, murder, etc.) versus those who com- 
mitted passive crimes (i.e., forgery, breach of 
trust, drunken driving, etc.) shows that there 
is a tendency for the aggressive criminals to 
obtain F > 16 scores more frequently than 
the passive criminals (χ² = 3.04, df = 1, .05
< p < .10).

The interpretation of MMPI F scores as 
indicating aggressiveness also casts some light 
on the differential F > 16 scores by diagnosis. 
Neurotics, who obtained no F > 16 scores, 
tended to commit passive or asocial crimes, 
whereas the behavior disorders and psychotics 
tended to commit antisocial crimes. An illus- 
tration may clarify this point. Of the 14 sex 
crimes committed by neurotics, 8 were incest, 
3 were rape, and 3 were “Peeping Tom.” 
With regard to the 42 sex crimes committed 
by behavior disorders and psychotics, 21 were 
rape or attempted rape, 7 were lewd acts on 
children, 6 were indecent exposure, 5 were 
incest, 2 were sending obscene letters to 
women, and 1 was “Peeping Tom.” This lat- 
ter group would appear to contain a far 
higher percentage of aggressive acts against 
society than the neurotic group, which is con- 
sistent with the differential F > 16 findings 
and the interpretation of F as an indicator of 
aggressive and sadistic behavior. 
SUMMARY


This study investigated the relations be-
tween high MMPI F scores, diagnostic classification,
and aggressive versus passive criminal
behavior to determine if F > 16 scores, which
usually lead to elimination of the MMPI
from analysis of the data, have any pre-
dictive value. MMPIs were available
for 246 white male court referrals who were
classified as behavior disorder (N = 194),
neurotic (N = 29), or psychotic (N = 23)
by the psychiatric staff.

Thirty-nine of the 246 subjects obtained
F > 16 scores. Thirty-seven of these 39 de- 
viant scores were obtained by behavior dis- 
orders. When the data were adjusted to equate 
the groups on age and intelligence, behavior 
disorders were shown to have 67% of the 
F > 16 scores, psychotics 33%, and neurotics 
0% of such scores. It was also demonstrated 
that younger men more frequently obtain in- 
valid F scores than older men. Although all 
the subjects had reason to dissemble, the re- 
sults seem most consistent with a charac-
terological interpretation of the F scale.

The practice of discarding MMPI data be-
cause of invalid F scores seems highly ques-
tionable, especially if the investigator wishes 
to draw valid conclusions about groups, such 
as behavior disorders, who are likely to dis- 
play aggressive, antisocial actions. 
INTERACTION OF BRAIN INJURY WITH PERIPHERAL
VISION AND SET


HAROLD L. WILLIAMS, CHARLES F. GIESEKING, AND ARDIE LUBIN
Walter Reed Army Institute of Research 


On tests such as Kohs’ Block Design, the 
Bender-Gestalt, or Benton’s Memory for De- 
signs where designs have to be reproduced, 
the phenomenon called rotation frequently oc-
curs. The subject reproduces the design cor-
rectly, but tilts it at an angle to the target
design, sometimes as much as 45° to 90°. 
Rotation has been observed frequently in 
brain injured patients, in children, and in 
dull normals (Bender & Teuber, 1948; Gold- 
stein & Scheerer, 1941; Hanvik & Anderson, 
1950; Pascal & Suttell, 1951). 

The Block Design Rotation Test (BDRT¹),
devised by Shapiro (1951), was used in a se-
ries of studies by Shapiro (1951, 1952, 1953)
and Yates (1954) to show that geometric
properties of the target design had a signifi-
cant effect on rotation, that intelligence correlated
negatively with rotation, and that reducing peripheral
vision increased rotation in normal subjects.
Williams, Lubin, Gieseking, and Rubinstein 
(1956) confirmed these findings but found 
that much of the rotation variance was ac- 
counted for by intelligence. This effect was 
so strong that dull normal subjects could not 
be discriminated from brain injured on the 
basis of their rotation scores. In addition, 
they found an interaction between brain in- 
jury and peripheral vision; restricting pe- 
ripheral vision did increase rotation for con- 
trols but decreased rotation for the brain 
injured. 

This paper describes two experiments. The 
first experiment demonstrates the existence of 
an interaction between instructions and brain 
injury; calling attention to tilt in the repro- 
duced design benefits normal subjects more 
than brain injured. The second experiment 


1 In the BDRT the subject uses four blocks taken
from the Wechsler-Bellevue Block Design subtest to
reproduce blue and yellow designs, 1 inch square,
painted on a white card, 6 inches square.


shows that this interaction effect and the ef- 
fect due to reduced peripheral vision can be 
replicated, and that the two interactions can 
be combined to discriminate dull normals 
from brain injured. 


EXPERIMENT 1: EFFECT OF NO-TILT
INSTRUCTIONS


In all previous studies using the BDRT, 
the standard Wechsler-Bellevue Block Design 
instructions were used; i.e., the subject was 
not warned about rotation, he was simply told 
to reproduce the designs as accurately as pos- 
sible. Thus, it was not possible to determine 
how much of the greater rotation of the brain 
injured was produced by their inattention to 
tilt. 

On occasion, subjects were asked to correct 
the tilt in their completed designs. Some brain 
injured subjects were unable to do so, al- 
though they seemed to be trying. Control sub- 
jects generally had no difficulty when the tilt 
was called to their attention. This suggested 
that rotation was partly the result of inatten- 
tion, but that the brain injured subjects, in 
addition, had difficulty perceiving rotation. 

In Experiment 1, brain injured patients 
were compared with non-brain-injured con- 
trols under standard and no-tilt instructions. 
Four groups of 20 subjects were used: (a) 
brain injured with standard instructions (BS), 
(b) brain injured with no-tilt instructions 
(BN), (c) controls with standard instruc- 
tions (CS), (d) controls with no-tilt instruc- 
tions (CN). 


Subjects 


Forty male brain injured patients were selected 
from the Neurology and Neurosurgery Services at 
Walter Reed General Hospital. Table 1 shows the 
frequencies for the several types of injuries and a 
breakdown of these for approximate lateral localiza- 
TABLE 1


CLASSIFICATION OF BRAIN INJURED SUBJECTS ACCORD-
ING TO TYPE AND LOCATION OF INJURY: EXPERI-
MENT 1


Location of injury 


Type of injury Left Right Bilateral N 
Skull fracture 5 1 5 10 
Gunshot wound 3 1 i 3 
Closed head injury 
Vascular disorders 4 4 2 10 
Neoplasms 4 2 - 6 
Encephalopathies, n.e.c. 7 7 

Total 14 8 18 40 


tion.² The category “encephalopathies, n.e.c.” includes
cases with encephalitis, Wilson’s disease, and menin- 
gitis. 

As can be seen in Table 1, the majority of the 
brain injured patients had relatively diffuse damage, 
difficult to localize. They were tested as soon after 
hospitalization as they were able to cooperate with 
the examiner, and understand instructions. At the 
time of testing, the length of hospitalization ranged 
from 1 to 7 months with a mean of 2.4 and a stand- 
ard deviation of 1.6. 

In a brief mental-status examination conducted
prior to each test, the examiner judged 14 patients
to be disoriented for time and/or place, with im-
paired memory. Patients were accepted for the study
if they showed in practice trials that they under-
stood the standard instructions for the Wechsler-
Bellevue Block Design subtest. Prior to the occur-
rence of brain injury, all patients had been in gen-
eral good health.

The 40 male controls were selected from the non-
brain-injured, nonpsychiatric patient population at
Walter Reed General Hospital. They had exhibited no
signs of CNS damage on examination at hospital
entry. A questionnaire was used to eliminate patients
with a history of head injury.

The Army Classification Battery (ACB) (Monta-
gue, Williams, Lubin, & Gieseking, 1957), adminis-
tered at Army entry previous to illness or injury,
was available for 30 of the brain injured patients.
Thirty of the controls were so selected as to match,
individually, these 30 patients on the Pattern Analy-
sis subtest of the ACB and on the time interval be-


2 Tables giving additional information on symp- 
tomatology, mental status, and special diagnostic 
studies for each patient are filed with the American 
Documentation Institute. Order Document No. 6871 
from ADI Auxiliary Publications Project, Photodupli- 
cation Service, Library of Congress; Washington 25, 
D. C., remitting in advance $1.75 for microfilm or 
$2.50 for photocopies. Make checks payable to: 
Chief, Photoduplication Service, Library of Congress, 
tween first administration of the ACB and subsequent
administration at hospital entry. Pattern Analysis
was used because it appears to be a reliable, valid
measure of spatial ability, relatively free from the
effects of education. Matching on time since initial
testing provides some control on age and rank. There
were no significant differences in age among the four
groups. The ages ranged from 18 to 50 years with a
mean of 28.4 and a standard deviation of 8.3.


Procedure


The 40 target cards of the BDRT were placed, 1
by 1, in a single marked position on a table 52
inches wide by 50 inches long by 30 inches high.
The surface of the table was painted a dull [illegible].
The target card was centered with respect to the
length of the table and was 15 inches from the sub-
ject. The subject made his block designs within a
circle of points, 6 inches in diameter, located at the
table edge.

In the “no-tilt” groups (BN and CN) attention to
correct orientation was induced by instructions, dem-
onstrations of tilt, practice trials, and warnings when
rotation occurred during the test. The standard
groups (BS and CS) received standard Wechsler
Block-Design instructions.

When the subject indicated that he had completed
a design, his reproduction and the target design were
photographed with an overhead camera. Later, de-
grees of rotation from the target design were meas-
ured from the film, using a ruler and [illegible].
Two individuals made independent measures of ro-
tation for each target card, and adjusted their scores
after discussion of major disagreements. The ad-
justed scores showed an average difference of about
2 degrees. The average of the two adjusted scores
was used for each card.

Prior to administering the BDRT, all 80 subjects
were given the Arithmetic, Vocabulary, and Block
Design subtests of Wechsler-Bellevue. This Wechsler
AVB combination was used to estimate intellectual
level at the time of the study.

Results


Figure 1 shows the four groups with respect
to the M³ rotation score and the Wechsler
AVB measure of intelligence.
The control group receiving the no-tilt in-
structions is significantly⁴ lower on rotation
than the other three groups. If we remove
this CN group, the remaining three groups
do not differ significantly on M. Table 2 gives


3 R, the total degrees of rotation over the 40 cards,
has a very skewed distribution. The logarithmic func-
tion M = 100 [log R − 1] was used to reduce skew
and to reduce nonlinearity of regression on intelli-
gence.

4 In this paper “significant” refers to the .05 level
or better.

the mean and variance of M for each group
of 20 subjects, as well as the correlation of 
M with AVB within each group. By the usual 
two-way analysis of variance of Table 2, the 
effect of instructions is significant at the .01 
level, the brain injury effect is significant at 
the .05 level. The interaction of brain injury 
and instructions is significant at the .05 level, 
only if a one-tailed test is used. 

The average M scores for the diagnostic 
classifications shown in Table 1 did not differ 
significantly, nor was there a significant later- 
ality effect. There was no significant differ- 
ence between average M scores for the groups 
judged to be oriented and disoriented in the 
mental status examination. 
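
The rotation score M used in these results is the logarithmic transformation M = 100 [log R − 1] of the total degrees of rotation R. The Python sketch below shows the transformation and its inverse. It assumes a base-10 logarithm, which the paper does not state explicitly; that choice is adopted here because it turns the CN group mean in Table 2 into a value of a few degrees per card, in line with the Discussion. The function names are illustrative.

```python
import math

N_CARDS = 40  # number of BDRT target cards in Experiment 1

def rotation_score(total_degrees):
    """M = 100 * (log10(R) - 1), with R the total degrees of rotation."""
    return 100.0 * (math.log10(total_degrees) - 1.0)

def total_degrees_from_score(m):
    """Inverse of rotation_score: recover R from M."""
    return 10.0 ** (m / 100.0 + 1.0)

# Group mean M for controls with no-tilt instructions (Table 2).
r_total = total_degrees_from_score(114.35)
print(f"R = {r_total:.1f} degrees, {r_total / N_CARDS:.1f} degrees per card")
```

Because M is averaged on the log scale, back-transforming a group mean gives a geometric-type average of R, which is smaller than the arithmetic mean when the distribution is skewed.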


Discussion 

When tilt is called to the subject’s atten- 
tion, the controls are able to reduce their ro- 
tation to about 4 degrees per card, quite close 
to the 2-degree average error measurement. 
However, the brain injured, even with no-tilt 
instructions, still average about 12 degrees of 
rotation per card. These results imply that 
most of the rotation by patients with recent 
general brain injury is due to impairment of 
perception rather than inattention. 

In previous studies we were puzzled by the 
persistent negative correlation of about —.50 
between rotation and intelligence. Clinical ob- 
servation suggested that with standard block 
design instructions brighter subjects perceived 
the importance of correct orientation. Dull 
subjects seemed less concerned about the 
proper orientation of their design. If atten- 
tion to rotation increases with intelligence, 
then the strength of the correlation between 


FIG. 1. Position of the four brain injury instruc-
tion groups with respect to rotation (M) and intelli-
gence.


M and AVB should be reduced in the no-tilt 
instruction groups. As can be seen in Table 2, 
this is not true. The nature of the relation 
between intelligence and rotation remains a 
mystery. 


EXPERIMENT 2: COMBINED EFFECT OF RE-
DUCED PERIPHERAL VISION AND NO-TILT
INSTRUCTIONS ON ROTATION


The purpose of this experiment is to repli- 
cate the two interactions found previously and 
to show that they may be combined to demon- 
strate a significant difference between dull 
normals and brain injured. The relation be- 
tween intelligence and rotation is such that 
dull normals and brain injured rotate about 


TABLE 2
MEANS, VARIANCES, AND CORRELATIONS WITH INTELLIGENCE, FOR THE ROTATION
SCORE M, AS A FUNCTION OF BRAIN INJURY AND INSTRUCTIONS


                                                Rotation (M)        Correlation
                                                                    of M with
Group                                         Mean     Variance     AVB
Brain injured, standard instructions         165.15    2,054.13     [illegible]
Non-brain-injured, standard instructions     159.50    1,134.05     −.37
Brain injured, no-tilt instructions          150.40    2,008.78     [illegible]
Non-brain-injured, no-tilt instructions      114.35      558.13     −.64


Note.—The N in each group is 20.
TABLE 3


CLASSIFICATION OF BRAIN INJURED SUBJECTS ACCORD-
ING TO TYPE AND LOCATION OF INJURY: EXPERI-
MENT 2

Location of injury

Type of injury Left Right Bilateral N 
Skull fracture 2 2 4 8 
Gunshot wound ih 1 
Closed head injury 9 9 
Vascular disorders 1 1 
Encephalopathies, n.e.c. 1 1 
Total 3 3 14 20


the same amount. But the interactions of pe- 
ripheral vision and no-tilt instructions with 
brain injury are independent of intelligence. 
It follows that these two interactions could 
be combined to show a significant difference 
between dull normals and brain injured. 

The reasoning is as follows: Suppose rota- 
tion scores are obtained under three condi- 
tions, (a) standard instructions, (b) standard 
instructions combined with reduced peripheral 
vision and (c) no-tilt instructions with unre- 
stricted vision. Previous results indicate that 
brain injured and dull normals rotate equally 
often on Condition a. For the dull normals we 
would predict an increase in rotation from a 
to b, and a large decrease from b to c. For 
the brain injured patient there should be a 
decrease in rotation from a to b, and a slight 
drop from b to c. The difference score,
K = b − c, should show a significantly higher
mean for the dull normals since it adds the
absolute value of the decreased rotation for
no-tilt instructions to the increase in rotation
due to reduced peripheral vision.


Subjects 


Twenty male brain injured subjects were selected 
from the Neurology and Neurosurgery Services at 
Walter Reed General Hospital. Table 3 gives fre- 
quencies for the various diagnoses. There are pro- 
portionally more bilateral cases, but in other re- 
spects the group resembles the brain injured subjects 
of Experiment 1. At the time of testing the length of 
hospitalization ranged from 1 to 9 months, with a 
mean of 2.9 and a standard deviation of 2.2.

Twenty control patients were selected so as to 
match each brain injured subject on the Pattern 
Analysis score at time of Army entry, and on time 
between first and second administration of the ACB.
These controls were selected from the same popula-
tion described for Experiment 1. A dull normal group
was formed by selecting 20 control patients who had
made a score of 80 or below on Pattern Analysis at
the time of Army entry. (A score of 80 is one stand-
ard deviation below the mean.) The mean age for
the three groups was 26.9, the standard deviation 8.6,
and the ages ranged from 17 to 50 years. There were
no significant differences between the means of the
three groups.


Procedure 


Each subject was asked to designate his preferred 
eye, and monocular vision was used throughout. 

Let A designate the first 20 trials of the BDRT. 
Let B designate the second 20 trials of the BDRT. 
C represents 20 additional trials obtained by a don 
wise, 90° rotation of each of the first 20 BDRT 
cards. Every subject was tested in the same way:
first on A with standard instructions, then on B with
Shapiro's field reducer,⁵ finally on C with no-tilt in-
structions and unrestricted monocular vision.


Results 


Figure 2 shows the average degrees of ro-
tation per design. As predicted for B, the field-
reducer condition, rotation increases for the
dull normals and normals, whereas the brain
injured show a decrease in rotation. All three
groups show decreased rotation under no-tilt
instructions, but as expected, the brain in-
jured show the smallest improvement.


5 The field reducer is a mask fashioned from a
table tennis ball which covers the eye. It permits
central vision through a hole about 6 millimeters in
diameter.
TABLE 4
CHANGES IN DEGREES OF ROTATION PER CARD DUE TO
REDUCED PERIPHERAL VISION AND NO-TILT
INSTRUCTIONS


                               Mean
Group                A        B − A     K = B − C
Normals            16.578     1.138     [illegible]
Dull normals       25.460     1.768     22.037
Brain injured      23.958    −4.362      7.642

                             Variance
                     A        B − A      B − C
Normals            98.485    83.933     156.130
Dull normals      254.401   327.775     480.944
Brain injured     135.052    60.949     206.615


Note.—The N in each group is 20.


Table 4 gives the basic data for estimating 
the effect of each condition. Under Condition 
A, with standard instructions and unrestricted 
monocular vision, the brain injured and dull 
normals have about the same amount of ro- 
tation. As expected, the normals have a sig- 
nificantly lower mean than the other two 
groups. 

The column labeled B-A measures the ef- 
fect of reducing peripheral vision. Both the 
normals and the dull normals show a slight in-
crease in rotation, averaging about 1 or 2 de-
grees per card, whereas the brain injured have
a significant decrease in rotation. 

The column labeled K = B — C is equal, 
algebraically, to (B-A) — (C-A) and there- 
fore is equivalent to adding the effect of re- 
duced peripheral vision and subtracting the 
effect of no-tilt instructions. The dull normals 
are significantly higher than the brain injured 
on this combined measure of interaction. 
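
The identity K = (B − A) − (C − A) = B − C can be illustrated with the group means printed in Table 4. The Python sketch below is an illustration only. It recovers the mean rotation under Conditions B and C from the tabled means of A, B − A, and K, which is legitimate for means because they are additive. The dictionary layout and function name are assumptions made for the example.

```python
def combined_score(a, b, c):
    """K = (B - A) - (C - A), which simplifies to B - C."""
    return (b - a) - (c - a)

# Group means from Table 4: A, B - A, and K = B - C (degrees per card).
table4_means = {
    "dull normals": {"A": 25.460, "B-A": 1.768, "K": 22.037},
    "brain injured": {"A": 23.958, "B-A": -4.362, "K": 7.642},
}

for group, m in table4_means.items():
    b = m["A"] + m["B-A"]   # mean rotation with the field reducer
    c = b - m["K"]          # mean rotation with no-tilt instructions
    # The two-step difference agrees with the tabled K.
    assert abs(combined_score(m["A"], b, c) - m["K"]) < 1e-9
    print(f"{group}: A = {m['A']:.3f}, B = {b:.3f}, C = {c:.3f}")
```

Since the baseline A cancels, K does not depend on how much a subject rotates under standard instructions; it reflects only the contrast between the field-reducer and no-tilt conditions.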

K has a statistically significant correlation 
of .37 with the dichotomous criterion, brain 
injured vs. dull normals. The multiple regres- 
sion of the dichotomous criterion on the 
scores A, B, and C gave a multiple correla- 
tion of .42. This does not differ significantly 
from the .37 validity of K. In other words, 
the a priori function, K = B — C, is as good 
as the best empirical discriminating function. 
Neither function, however, is very useful for 
diagnostic purposes. The best (i.e., maximum 
likelihood) cutoff point yields only about 60%
correct classification. The average rotation 
scores for the diagnostic groups shown in 
Table 3 do not differ significantly, nor was 
there a significant difference between the av- 
erages for disoriented and oriented patients. 


Discussion 


Rotation in brain injured and normals oc- 
curs intermittently much as would be antici- 
pated from sporadic spells of inattention. Di- 
recting the attention of normal subjects to 
tilt does reduce rotation to an amount close 
to the error of measurement, but brain in- 
jured patients improve only slightly. The 
paradoxical finding that reducing peripheral 
cues causes normals to rotate more, but actu- 
ally improves the performance of the brain 
injured subject implies that relevant, pe- 
ripheral cues may cause orientation error in 
patients with brain injury. It may be in- 
ferred that there is a malfunctioning of the 
general integrating mechanism in the brain 
injured subject, such that relevant peripheral 
cues hamper performance by producing dis- 
torted perception. 

M. B. Shapiro (1952) hypothesizes that the 
greater rotation for brain injured subjects is 
due to an increase in cortical inhibition caused 
by trauma. Thus, the brain injured subject is 
rendered peripherally blind, and integration 
fails because the peripheral cues are not trans- 
mitted by the cortex. Therefore, Shapiro’s pre- 
diction would be that the field-reducer would 
have no effect on rotation for the brain in- 
jured subjects. The data of this and the previ- 
ous experiment (Williams et al., 1956) indi- 
cate, however, that the field-reducer facili- 
tates correct orientation by the brain injured. 

The results obtained by Strauss and Leh- 
tinen (1947) appear to be similar to ours. 
They state that brain injured subjects are 
easily distracted by stimuli; therefore reduc- 
ing the display to its essentials will improve 
performance. The field-reducer used in the 
present experiment may prevent peripheral 
stimuli from distracting the brain injured 
subject, thus enabling him to concentrate 
more effectively on the target. In place of 
Shapiro’s “inattention” hypothesis, we would 
substitute the notion that for the brain in- 
jured, relevant peripheral cues provide con-
fusing and distracting information about the
visual frame of reference.


SUMMARY 


Experiments were conducted to confirm the 
existence of two interactions of brain injury 
and experimental conditions on block design 
rotation: (a) Instruction to pay attention to 
the tilt of the reproduced designs resulted in 
a greater decrease of rotation for both nor- 
mal and dull normal controls than for the 
brain injured. (b) Restricting peripheral vi- 
sion increased rotation for normal and dull 
normal controls, but decreased it for the 
brain injured. Although the difference in pat- 
terns of performance for dull normals and 
brain injured was statistically significant, it 
was not great enough to furnish a basis for 
individual diagnosis. 

The results from this and previous experi- 
ments imply that the basic difficulty for brain 
injured subjects is not a failure of attention 
or peripheral blindness, but is a generalized 
defect of integration such that relevant pe- 
ripheral cues cause perceptual distortion. 
A NOTE ON TIME OF FIRST RESPONSES
IN RORSCHACH PROTOCOLS 


ALVIN G. BURSTEIN 1 
University of Michigan 


It is conventional in preparing the sum- 
mary of a subject’s Rorschach performance 
for diagnostic purposes to compute the mean 
time of first response (T/1R) for the series 
of 10 cards. The object is to obtain a measure 
of central tendency—to estimate the “typical” 
T/1R for that subject—(a) so that a sub- 
ject’s typical T/1R can be compared with 
typical times for nosological groups and (b) 
so that the T/1R on a specific card for a 
particular subject can be compared with that 
same subject’s typical T/1R. Examples of the 
clinical value of such comparisons have been 
furnished by Beck (1949). Depressed and or- 
ganically deteriorated patients have a typi- 
cally slower T/1R than do normals (a com- 
parison of the first type mentioned), while 
neurotic shock can often be identified in 
terms of a subject’s departure from his own 


typical T/1R (a comparison of the second
type).

Although the general practice is to compute
the mean T/1R, it might be well to consider
whether the median might not be more appro- 
priate. Since the population of response times 
can extend no lower than zero seconds but up 
to very high values, and since most response 
times are clustered near the low end, the 
population is skewed, and the mean and the 
median will not coincide. A choice between 
these two measures of central tendency could 
be based on the same arguments that favor 
the use of the median over the mean in de- 


1 The author is indebted to S. J. Beck of the
University of Chicago and to Sheldon Korchin of
Michael Reese Hospital for their assistance in ob-
taining access to the normal Rorschach protocols.


scribing the “average” American’s income; 
the sensitivity of the mean to extreme values 
makes it appear preferable to have that fig- 
ure below which half the cases fall and above 
which half the cases fall, that is, the median. 

The kind of distortion to which the mean 
T/1R may be subject is illustrated in a case 
reported by Beck (1949, pp. 281-287). In 
evaluating evidence for neurotic shock on 
several cards, Beck used, as a basis of com- 
parison not the overall mean T/1R of 65.1 
seconds, but rather a corrected mean T/1R— 
28 seconds—obtained by ignoring the three 
largest values. It should be noted that, be- 
cause it is less sensitive to extreme values, the 
median T/1R—33 seconds—could have been 
used without such correction. Since it is diffi- 
cult in such cases to make the subjective 
judgment of which and how many extreme 
values to drop, the use of the median would 
appear advantageous. 
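
The contrast between the two measures can be seen with a small numerical illustration. The Python sketch below uses ten hypothetical times of first response, invented for this example and not taken from Beck's case or from any protocol. It compares the mean, the median, and a mean computed after dropping the three largest values; the helper name `trimmed_mean` is an assumption of the sketch.

```python
from statistics import mean, median

# Hypothetical times of first response (seconds) for ten cards;
# these are illustrative values, not data from any protocol.
times = [12, 15, 18, 20, 22, 25, 30, 95, 140, 210]

def trimmed_mean(values, n_drop):
    """Mean after dropping the n_drop largest values."""
    kept = sorted(values)[: len(values) - n_drop]
    return mean(kept)

print("mean:", mean(times))
print("median:", median(times))
print("mean without 3 largest:", trimmed_mean(times, 3))
```

With values of this kind the three long delays pull the mean far above most of the individual times, while the median stays near the bulk of the responses without any decision about which values to discard.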

In an effort to supply some normative in- 
formation, the median T/1R was computed 
for 154 of the 157 protocols collected as a 
normative sample by Beck, Rabin, Theissen, 
Molish, and Thetford (1950; the remaining 
three were not available at the time of the 
analysis). These protocols had a mean median 
T/1R of 25.6 seconds as compared with a 
mean mean T/1R of 32.5 seconds. 

The really critical issue in deciding which 
measure of central tendency to employ is 
what we wish to represent by that measure. 
It is characteristic of the median that it will 
represent the typical response time in the 
sense that exactly half of the subject’s response
times will be shorter than the median
and half longer. As we have seen, the sensitivity
of the mean to extreme values can give
us a typical time much higher than this. It is
suggested therefore that the median more
adequately represents the typical response time,
and substituting the median for the mean in 
Rorschach protocols should help make the 
clinical use of the time of first response more 
meaningful. 


Alvin G. Burstein


EGO STRENGTH AND CONFLICT DISCRIMINATION:
A FAILURE OF REPLICATION


JACK BLOCK 


University of California, Berkeley


Korman (1960) recently has reported a 
study wherein the latency of psychophysical 
judgments was found to be related to scores 
on Barron’s Es (ego strength) scale. Subjects 
scoring low on the Es scale were found to 
discriminate more slowly than subjects scor- 
ing high. Further, as the objective difficulty 
of decision was increased, this difference in 
latency of judgment was found to increase. 
These results were sought specifically, as one 
route to the construct validation of the Es 
scale. The general principle of validating a 
measure by relating it to a very different 
index of the underlying construct is of course 
a worthy one and it is with reluctance that 
the present note introduces data which fail 
to confirm the Korman finding. 

In a previous study (Block & Petersen, 
1955) after which the Korman experiment in 
part was patterned, latency scores for both 
easy and difficult judgment-discriminations
were available which are fully equivalent to
the latency scores employed by Korman. Also
available were scores for each of the 53 subjects
on the Es scale. For the easy decision
situation and separately for the difficult de- 
cision situation, the 53 subjects were ordered 
with respect to their decision latencies. The 
Es scores of the fastest 25 subjects were then 
contrasted with the Es scores of the slowest 25
subjects (the intermediate three subjects be- 
ing omitted for reason of computational con- 
venience). The fast deciders in an objectively 
easy decision situation had a mean Es score of 
50.96 with an SD of 4.02; slow deciders had 
a mean of 50.88 with an SD of 5.03. The fast 
deciders in an objectively ambiguous situa- 
tion had a mean Es score of 50.72 with an 
SD of 4.22; the slow deciders had a mean 
Es score of 51.16 with an SD of 4.88. Obviously,
in this study there is no relation between
Es scores and the ability to rapidly resolve
discrimination conflicts.
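
The size of these group differences can be checked from the summary figures alone. The sketch below computes a t statistic from the reported means and SDs. The source does not say which test, if any, was applied, so the unequal-variance (Welch) form, the treatment of the reported SDs as sample SDs, and the function name `t_from_summary` are all assumptions.

```python
import math


def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t statistic for two independent groups, from summary figures only.

    Uses the unequal-variance standard error sqrt(sd_a^2/n_a + sd_b^2/n_b).
    """
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Means and SDs as reported in the text: 25 fast vs. 25 slow deciders.
t_easy = t_from_summary(50.96, 4.02, 25, 50.88, 5.03, 25)
t_ambiguous = t_from_summary(50.72, 4.22, 25, 51.16, 4.88, 25)

print(f"easy situation:      t = {t_easy:.3f}")
print(f"ambiguous situation: t = {t_ambiguous:.3f}")
```

Both values fall far short of the magnitude of roughly 2 that groups of this size would need for significance at the .05 level, which agrees with the conclusion drawn in the text.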

How may such an empirical discrepancy be 
understood? What factors may be contribut- 
ing to a finding of relationship in the one 
study and the absence of a relationship in 
the other? 

One immediately obvious consideration is 
that the samples employed in the two studies 
are radically different. The Korman study 
used 47 psychiatric inpatients, all presumably 
with sufficient internal and manifest psy- 
chopathology to warrant commitment. The 
Block-Petersen study employed Air Force 
captains, all presumably individuals within 
the normal range of adjustment. The mean 
Es score for the Korman sample was about 
41, well below the Block-Petersen sample 
mean of 51.32. The Es standard deviation of
the Korman sample is 6.75, somewhat but
not significantly higher than the SD of 4.54 in
the Block-Petersen sample. These are important
differences, for the psychological significance
of a given Es score in the one sample
may not correspond to its meaning in the 
other sample. Simply at the quantitative level, 
the differences in the Es means of the two 
samples suggest that the high Es scorers of 
the Korman sample were relatively low scorers 
when considered relative to the Block-Peter- 
sen sample. 

It would be presumptuous to discuss the 
comparative merits of these two samples for 
an appropriate test of the Korman hypothe- 
sis. Properly, many more samples should be 
studied so that a pattern of results and their 
converging implication may appear. It would 
seem clear, however, in this instance and in 
many more that doubtless could be recounted,
that the characteristics of the sample being
studied must be recognized explicitly as modi- 
fying in decisive ways the relationships ob- 
served (Block, 1955). To the plea of Camp- 
bell and Fiske (1959) for “heterotrait” and 
“heteromethod” validity must be added the
requirement of heterogroup validity as well.


ANALYSIS OF THE WISC PERFORMANCE OF BRAIN DAMAGED
AND EMOTIONALLY DISTURBED CHILDREN 1


VINTON N. ROWLEY


State University of Iowa


A diagnostic question to which the clinical
psychologist is often called upon to contribute is
the differentiation of emotionally disturbed and
brain damaged children with average or superior
intelligence. In this situation, it is not uncommon
to employ the discrepancy between the Verbal
scale and the Performance scale of the WISC
and similar indices as aids in the identification
of brain damage. However, there is little empirical
evidence to support these particular uses of
the WISC as a diagnostic technique.

The present investigation analyzed the WISC
performances of emotionally disturbed and brain
damaged children with respect to such characteristics
as overall level and pattern of performance,
differences between Verbal and Performance
scale quotients, and differences between subtest
means. The emotionally disturbed group consisted
of 30 children (mean age = 10.5 years)
who had been seen in the Child Psychiatry Service,
University of Iowa Psychopathic Hospital,
because of behavioral maladjustment and who
had been diagnosed as emotionally disturbed with
no evidence or history of cerebral disease. The
brain damaged group consisted of 30 children
(mean age = 10.6 years) who had been seen in
the Pediatrics Clinic, University Hospitals, and
who showed unequivocal evidence of disease
involving one or both cerebral hemispheres.

In order to provide for as precise a comparison
as possible, certain restrictions were observed
in the selection of subjects. The subjects were
individually matched with respect to sex, CA, and
WISC Full Scale IQ to minimize the effects of
these variables on performance patterns. A minimal
IQ of 83 was established in order to exclude
defective children.

The essential findings were: (a) there was no
significant difference between the two groups with
respect to either Verbal scale or Performance
scale IQ; (b) Verbal scale-Performance scale
relationships were not significantly different in the
two groups; (c) the profiles of subtest scores in
the two groups were not significantly different;
(d) none of the individual subtest scores showed
a significant intergroup difference.

The general i[...] from these
findings is that, when overall level of perform-

1 This investigation was supported by a grant
(B-616) from the National Institute of Neurological
Diseases and Blindness, United States Public Health
Service. The writer is indebted to Arthur L. Benton
for aid in planning and executing the study.

An extended report of this study may be obtained
without charge from Vinton N. Rowley (Department
of Psychiatry, State University of Iowa; Iowa City,
Iowa) or for a fee from the American Documentation
Institute. Order Document No. 6872 from
ADI Auxiliary Publications Project, Photoduplication
Service, Library of Congress; Washington 25, D. C.,
remitting in advance $1.75 for microfilm or $2.50
for photocopies. Make checks payable to: Chief,
Photoduplication Service, Library of Congress.


SOCIAL DESIRABILITY IN THE RATINGS OF INVOLVED
AND NEUTRAL JUDGES 1


GEORGE LEVINGER


Research on personality rating devices has revealed
a high positive correlation between probability
of item endorsement and the item’s perceived
social desirability (e.g., Edwards, 1953,
1957b). Recent studies have been mainly concerned
with ascertaining the pervasiveness of this
correlation, and with constructing instruments
which would reduce this bias. The present report,
dealing with one aspect of the problem from a
somewhat different perspective, conceives of
personality ratings as the reflection of “true score”
and error displacement.

On the one hand, it is hypothesized that desirable
traits are truly more common than undesirable
ones. It is also hypothesized that raters
will distort their ratings in a desirable direction,
to the extent of their attraction toward the object
rated.


[...] items (Edwards, [...] at a statistically
significant level. The correlations ranged from
.08, for the clinicians’ ratings of disturbed children,
to .83, for the school parents’ descriptions
of themselves. Such a finding is not surprising,
considering the lengthy socialization process in
any human culture. It seems that an objective
judge would tend to place almost any person
above the neutral point of social desirability.

1 An extended report of this study may be obtained
without charge from George Levinger (Western Reserve
University; Cleveland, Ohio) or for a fee from
the American Documentation Institute. Order Document
No. 6873 from ADI Auxiliary Publications
Project, Photoduplication Service, Library of Congress;
Washington 25, D. C., remitting in advance
$1.25 for microfilm or $1.25 for photocopies. Make
checks payable to: Chief, Photoduplication Service,
Library of Congress.

The research was part of a larger study supported
by a grant from the National Institute of Mental
Health.

Regarding the second hypothesis, parents’ ratings
of their children, and of themselves, were
consistently more favorable than those of the
teachers or clinicians. The findings, while [...]
to the correlational data mentioned above, tend
to support the hypothesis.

The findings are not in themselves novel.
Their implication is that investigations of social
desirability loadings should not limit their [...]
to item content alone, but also concern themselves
with the nature of the judge-object relationship.

For example, a disturbed person will probably
describe himself less favorably than will a [...]
disturbed one. It would seem that this is not so
much due to the former’s different interpretation
of item content as to his unfavorable state of
personal self-attraction. And, when one [...] positive
changes in self-description among patients in
psychotherapy, the scores may merely reflect
changing attraction between subject and object.


MEASUREMENT OF THE SEVERITY OF DISORDER IN SCHIZOPHRENIA
BY MEANS OF THE HOLTZMAN INKBLOT TEST 1


RICHARD A. STEFFY AND WESLEY C. BECKER


University of Illinois 


Elgin Prognostic Scale ratings of behavioral
and case history data were correlated with genetic
level ratings (Becker, 1956) of Holtzman Inkblot
responses on a sample of 36 Veterans Administration
hospitalized schizophrenics. A product-moment
r of -.36 (p < .05) supported the prediction
that subjects with poorer Elgin Prognostic
Scale ratings give more diffuse, undifferentiated,
immature responses to inkblot stimuli.
Although the relationship between genetic level
and Elgin ratings was not as high as the one
found by Becker (1956) using the Rorschach,
differences between samples in duration of present
hospitalization were shown to attenuate the
relationship. Longer hospitalization leads to lower
Elgin ratings on some scales (duration of psychosis,
social withdrawal) and to improved functioning
on inkblot tests (as precipitating stresses
are removed). Partialing out the effect of duration
of hospitalization increased the correlation
between the Elgin scale and the genetic level
scoring of the Holtzman to -.46.

1 An extended report of this study may be obtained
without charge from W. C. Becker (Department
of Psychology, University of Illinois; Urbana,
Illinois) or for a fee from the American Documentation
Institute. Order Document No. 6874 from
ADI Auxiliary Publications Project, Photoduplication
Service, Library of Congress; Washington 25,
D. C. [...]
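
The partialing step reported for the Elgin and genetic level correlation can be illustrated with the standard first-order partial correlation formula. This is only a sketch: the abstract does not state which procedure the authors used, and the two correlations involving duration of hospitalization are not reported, so the values assigned to `r_xz` and `r_yz` below are hypothetical placeholders.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


r_xy = -0.36  # reported: Elgin rating vs. genetic level score
r_xz = 0.30   # HYPOTHETICAL: Elgin rating vs. duration of hospitalization
r_yz = 0.30   # HYPOTHETICAL: genetic level score vs. duration

adjusted = partial_corr(r_xy, r_xz, r_yz)
print(f"zero-order r = {r_xy:.2f}, partial r = {adjusted:.2f}")
```

When the third variable correlates in the same direction with both measures, removing it makes a negative zero-order correlation more strongly negative, which is the kind of attenuation the abstract describes. The placeholder values are not meant to reproduce the reported -.46.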

Additional analyses of the Holtzman test were
made to explore the limits of its potential in this
area. Although based on a small sample and needing
cross-validation, an item analysis revealed a
best subset of 13 Holtzman cards that correlated
-.53 (and -.64 after partialing out duration of
hospitalization) with the Elgin scale criterion
ratings. The best linear combination of fi[...]
Holtzman variables entering into the genetic level
scoring system did nearly as well in predicting
the Elgin scale as the pattern scores of the genetic
level scoring system. It is concluded that
more extensive studies of this type with the Holtzman
test offer promise of producing good measures
of degree of pathology in schizophrenia.


LYONS RELATIONSHIP SCALES:
A STUDY OF RELIABILITY 1


ELIZABETH L. GOUCHER, 2 [...]


An important area of interpersonal function[...]

This study reports the development of such scales
and an evaluation of their reliability.

[...] Scales consist of two [...]
Relationship from the [...]
They were designed to be used after an interview
with the patient and an interview with the relative.
Each schedule consi[...]
the description given by the patient of his rela[...]
[...] interchange of id[...]
adaptability, discussion of [...] of
work, degree of overprotecti[...]
affection, degree of hosti[...]
elicited material.

The reliability of these instruments was assessed
by studying results obtained from Schedule I,
The Relationship from the Viewpoint of
the Patient. The assumption was made that since
the items on Schedule II were analogous to those
on Schedule I, ratings based upon interviewing
a relative would be at least as reliable as
those based upon interviewing a patient. Thirty-four
patients were interviewed and independently
rated by a panel of four social workers.

[...] Q (Hester, 1957, unpublished) was [...]
one respondent from two of the items
(sharing of work and consideration of patient’s
illness) did not attain a satisfactory Q value. The
item on adaptability was discarded since it was
too often considered unrateable. Two others
(discussion of problems and money arrangements)
were borderline and were tentatively retained
pending further study.

As a further attempt to assess the reliability
of the scale, the statistic κ (kappa) developed by
Cohen (1960) was utilized. One of the major respects
in which this statistic differs [...]
rather than being based upon dich[...]
[...] agreement for all the
scale items except for the two that did not attain
[...] with face validity
[...] may be of value in
[...] of an important
area of interpersonal relationships.

1 An extended report of this study may be obtained
without charge from Emily Scanlan (Chief,
Social Work Service, Veterans Administration Hospital;
Lyons, New Jersey) or for a fee from the
American Documentation Institute. Order Document
No. 6871 from ADI Auxiliary Publications Project,
Photoduplication Service, Library of Congress;
Washington 25, D. C., remitting in advance $1.75
for microfilm or $2.50 for photocopies. Make checks
payable to: Chief, Photoduplication Service, Library
of Congress.

2 [...] at the New Jersey Neuropsychiatric Institute [...]
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